Reproduction and Analysis of leukemia_mobilenet.ipynb

Dataset and Source

The notebook leukemia_mobilenet.ipynb uses the Leukemia Classification Dataset published on Kaggle:

Dataset: Leukemia Classification Dataset

Source: https://www.kaggle.com/datasets/andrewmvd/leukemia-classification

This dataset contains segmented and masked nuclei extracted from peripheral blood smear images. Labels correspond to a binary classification task distinguishing ALL (Acute Lymphoblastic Leukemia) nuclei from HEM (benign hematogone) nuclei.

The dataset is widely used in demonstration notebooks for leukemia classification but represents a highly curated and preprocessed view of the problem.

Model Architecture

The reproduced model is implemented in PyTorch and uses a pretrained MobileNetV2 backbone, loaded through the timm library as the mobilenetv2_100 variant, as its feature extractor.

Key architectural details include:

Pretrained MobileNetV2 convolutional base

Replacement of the original classifier head

A custom fully connected classification head consisting of:

Linear layer reducing feature dimensionality

ReLU activations

Dropout layers for regularization

Final linear layer producing two outputs (ALL vs HEM)

Softmax activation for class probability estimation

This configuration follows a standard transfer-learning paradigm, leveraging pretrained representations while adapting the classifier to a binary medical imaging task.

Platform Adaptation and Execution Challenges

Reproducing this notebook on macOS required substantial modification. The original implementation contained multiple CUDA-specific assumptions, including hard-coded .cuda() calls and device logic incompatible with Apple Silicon.

Key adaptations included:

Replacing CUDA-specific calls with dynamic device selection

Ensuring compatibility with the MPS backend

Auditing tensor placement so that the model and all input tensors sit on the same device, preventing CPU/GPU device mismatches

After these changes, the model was successfully executed on macOS using MPS acceleration.
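The device-selection change can be sketched as below. This is an illustrative pattern under stated assumptions rather than the notebook's actual code: the helper name select_device and the tiny stand-in model and batch are hypothetical. The idea is to replace each hard-coded .cuda() call with .to(device), where device is chosen once at startup.

```python
import torch


def select_device() -> torch.device:
    """Prefer CUDA, then Apple's MPS backend, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps is available in PyTorch 1.12 and later.
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = select_device()

# Hypothetical stand-ins for the notebook's model and input batch.
model = torch.nn.Linear(4, 2).to(device)  # instead of model.cuda()
batch = torch.randn(8, 4).to(device)      # instead of batch.cuda()

with torch.no_grad():
    output = model(batch)

# Both the parameters and the output live on the selected device type.
print(device.type, output.device.type)
```

Selecting the device in one place means the same code runs unchanged on CUDA machines, Apple Silicon, and CPU-only environments.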

Training Configuration and Evaluation

To reduce runtime during reproduction, training was limited to 10 epochs rather than the longer schedules typically used in benchmark notebooks.

Despite the shorter training schedule, the model reached an overall accuracy of 0.90 on the held-out test set of 3,199 nuclei.

Confusion Matrix (rows: true class, columns: predicted class; class order ALL, HEM)
[[2044   96]
 [ 234  825]]

Classification Report

ALL
Precision: 0.90
Recall: 0.96
F1-score: 0.93

HEM
Precision: 0.90
Recall: 0.78
F1-score: 0.83

Overall Accuracy: 0.90

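These results indicate effective discrimination between leukemic and benign nuclei under the assumptions imposed by the dataset, although performance is uneven across classes: HEM recall is 0.78, meaning 234 of the 1,059 HEM nuclei were classified as ALL.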
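As a consistency check, the reported metrics can be recomputed directly from the confusion matrix. The sketch below uses only the four counts given above and assumes rows are true classes and columns are predicted classes, in the order ALL, HEM. That orientation is the one under which the reported precision and recall values are reproduced. The function name per_class_metrics is illustrative.

```python
# Counts copied from the confusion matrix above.
# Assumed layout: rows = true class, columns = predicted class.
confusion = [
    [2044, 96],   # true ALL: predicted ALL, predicted HEM
    [234, 825],   # true HEM: predicted ALL, predicted HEM
]
labels = ["ALL", "HEM"]


def per_class_metrics(matrix, index):
    """Return (precision, recall, f1) for the class at the given index."""
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                  # rest of the true-class row
    fp = sum(row[index] for row in matrix) - tp   # rest of the predicted column
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


report = {label: per_class_metrics(confusion, i) for i, label in enumerate(labels)}
total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(labels))) / total

for label, (precision, recall, f1) in report.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f} over {total} samples")
```

The recomputed values round to the figures in the classification report, which supports reading the matrix with true classes on the rows.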

Observations and Limitations

While performance metrics are strong, several important limitations were identified:

Highly Curated Input Representation
The model operates on pre-segmented, masked nuclei rather than raw peripheral blood smear images. This removes significant sources of real-world variability, such as overlapping cells, staining artifacts, and platelet interference.

Narrow Disease Scope
The classification task is limited to ALL vs HEM, excluding other leukemia subtypes and myeloid lineage abnormalities. As such, the approach does not generalize to broader hematologic screening tasks.

Pipeline Dependency on Prior Segmentation
The model’s success depends on the availability of accurate nuclei masks. In a clinical pipeline, this would require an additional segmentation model whose errors would directly affect classification reliability.

Relevance to the Capstone Project

This notebook provides a strong example of how modern lightweight architectures, such as MobileNetV2, can achieve high performance on leukemia classification when trained on curated nuclei datasets. It demonstrates the effectiveness of transfer learning and carefully designed classifier heads for binary medical imaging tasks.

However, it also underscores a central motivation of the Capstone project: high accuracy on curated nuclei does not equate to robust clinical applicability. The Capstone explicitly targets end-to-end realism by incorporating raw smear context, platelet-aware categorization, and broader lineage coverage, rather than relying solely on masked nuclei and narrowly defined disease labels.

Summary

The leukemia_mobilenet.ipynb notebook was successfully reproduced on macOS after resolving CUDA-specific implementation constraints. The model reached an overall test accuracy of 0.90 after only 10 training epochs. Nevertheless, its reliance on pre-segmented nuclei and its focus on a single leukemia subtype limit its clinical generalizability. The insights gained from this reproduction reinforce the importance of dataset realism and task definition, both of which are central to the Capstone project’s design philosophy.