# COGS 188 - Leukemia Classifier Project Proposal

# Project Description

Our team is developing a medical AI system for early leukemia detection, a critical real-world challenge in healthcare. The project works with more than 3,000 blood cell images and implements a complete deep learning pipeline with model selection, uncertainty quantification, and multiple evaluation metrics. Rather than a toy problem, we're tackling a system that could have real clinical impact.

# Names

- Zain Khatri
- Mohsin Khawaja
- Syed Usmani
- Wilson Zhu

# Abstract 
Our project aims to develop an interpretable deep learning system for early detection and subtype classification of Acute Lymphoblastic Leukemia (ALL). We will work with a dataset of 3,242 peripheral blood smear images collected at Taleqani Hospital, with expert-validated labels for a benign class and three malignant subtypes (Early Pre-B, Pre-B, and Pro-B ALL). We're building a custom CNN architecture that moves beyond simple classification by incorporating uncertainty quantification and visual explanations to help medical professionals understand and trust its decisions. Performance will be measured not only with accuracy metrics but also with uncertainty calibration and interpretability measures, to ensure real clinical utility.



# Background

Our research focuses on Acute Lymphoblastic Leukemia (ALL), an aggressive blood cancer that requires rapid diagnosis for optimal treatment outcomes<a name="terwilliger"></a>[<sup>[1]</sup>](#terwilligernote). Manual microscopic examination remains the standard practice, but it faces significant challenges: studies report inter-observer variability of up to 30%<a name="labati"></a>[<sup>[2]</sup>](#labatinote).

Deep learning approaches show promise in addressing these challenges, with recent studies achieving over 90% accuracy in leukemia cell classification<a name="rehman"></a>[<sup>[3]</sup>](#rehmannote). However, a crucial gap remains: developing systems that provide interpretable results clinicians can trust and incorporate into their decision-making<a name="tonekaboni"></a>[<sup>[4]</sup>](#tonekabininote).

A key insight driving our approach is that uncertainty quantification in medical AI systems has emerged as crucial for clinical adoption, helping identify cases requiring additional expert review<a name="begoli"></a>[<sup>[5]</sup>](#begolinote).


# Problem Statement

We're developing a deep learning system that will:
1. Classify blood cell images into four categories (benign, Early Pre-B, Pre-B, and Pro-B ALL)
2. Provide uncertainty estimates for each classification
3. Generate visual explanations highlighting relevant cell features
4. Identify borderline cases requiring expert review

Our system must achieve:
- Classification accuracy comparable to expert pathologists (>90%)
- Reliable uncertainty estimates correlating with prediction errors
- Interpretable visual explanations that align with medical knowledge
- Real-time processing capability (<2 seconds per image)
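
The goal of flagging borderline cases for expert review can be sketched as a threshold on normalized predictive entropy. This is a minimal illustration, not our final rule: the `triage` function, the class ordering, and the 0.5 threshold are all assumptions that would need tuning on validation data.

```python
import math

# Class order is an assumption made for this sketch.
CLASSES = ["Benign", "Early Pre-B", "Pre-B", "Pro-B"]


def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to the range [0, 1]."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def triage(probs, threshold=0.5):
    """Return (predicted_label, uncertainty, needs_review) for one image.

    `threshold` is a hypothetical cut-off, not a validated clinical value.
    """
    uncertainty = normalized_entropy(probs)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], uncertainty, uncertainty > threshold


# A confident prediction passes through; an ambiguous one is flagged.
confident = triage([0.97, 0.01, 0.01, 0.01])
ambiguous = triage([0.40, 0.35, 0.15, 0.10])
```

A threshold like this trades expert workload against missed errors, so it would be chosen from the uncertainty-error relationship measured during evaluation rather than fixed in advance.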


# Data

We're working with the Blood Cells Cancer (ALL) dataset:
- Source: Taleqani Hospital (Tehran, Iran), available on Kaggle
- Size: 3,242 peripheral blood smear (PBS) images from 89 patients
- Distribution:
  * Benign: 512 images
  * Pre-B: 955 images
  * Pro-B: 796 images
  * Early Pre-B: 979 images
- Image specifications:
  * Format: JPG
  * Magnification: 100x
  * Captured using: Zeiss microscope camera
  * Expert-validated labels using flow cytometry
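
The class counts listed above show that the benign class is the smallest. One common way to compensate, shown here as an assumption rather than our settled method, is inverse-frequency class weighting of the loss. The `inverse_frequency_weights` helper is a hypothetical name used only for this sketch.

```python
# Image counts per class, taken from the distribution listed above.
COUNTS = {"Benign": 512, "Pre-B": 955, "Pro-B": 796, "Early Pre-B": 979}


def inverse_frequency_weights(counts):
    """Weight each class by N / (K * n_c), so rarer classes weigh more.

    N is the total number of images, K the number of classes, and n_c the
    number of images in class c. A perfectly balanced dataset gives 1.0
    for every class.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


weights = inverse_frequency_weights(COUNTS)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

These weights could be passed to a weighted cross-entropy loss. Oversampling the benign class is an alternative that leaves the loss unchanged.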

Our preprocessing pipeline includes:
- Image normalization and standardization
- Data augmentation (rotation, scaling, color jittering)
- Train/validation/test split (maintaining patient-wise separation)
- Class balancing techniques
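
Patient-wise separation means splitting by patient rather than by image, so that no patient's cells appear in more than one subset. The sketch below assumes each image can be paired with a patient identifier. The record format, the 70/15/15 fractions, and the `patient_wise_split` name are illustrative assumptions.

```python
import random


def patient_wise_split(records, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split (image_id, patient_id) records so no patient spans two subsets.

    The fractions apply to patients, not images, so the image proportions
    only approximately match them.
    """
    patients = sorted({pid for _, pid in records})
    random.Random(seed).shuffle(patients)

    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),
    }
    return {
        name: [rec for rec in records if rec[1] in pids]
        for name, pids in groups.items()
    }


# Toy records: 89 made-up patient IDs with three images each.
records = [
    (f"img_{p}_{i}.jpg", f"patient_{p:02d}")
    for p in range(89)
    for i in range(3)
]
splits = patient_wise_split(records)
```

Splitting by image instead would let near-identical cells from one patient leak into the test set and inflate the reported accuracy.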


# Proposed Solution

We're implementing a custom CNN architecture with three main components:

1. Base Architecture:
   - ResNet50 backbone pretrained on ImageNet
   - Custom head layers for multi-class classification
   - Monte Carlo dropout layers for uncertainty estimation

2. Interpretability Layer:
   - Grad-CAM implementation for visual explanations
   - Attention mechanisms to highlight relevant cell features

3. Uncertainty Quantification:
   - Ensemble of 5 models with different initializations
   - Monte Carlo dropout sampling (50 forward passes)
   - Calibrated probability estimates
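
For the uncertainty component, the stochastic forward passes for one image have to be reduced to a prediction and an uncertainty score. The sketch below assumes the softmax outputs are already collected as plain lists. Pooling ensemble members and dropout passes into a single sample set is our assumption here, and `summarize_samples` is a hypothetical helper name.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def summarize_samples(samples):
    """Reduce T sampled softmax vectors for one image to summary statistics.

    predictive_entropy: entropy of the averaged prediction (total uncertainty).
    mutual_information: total minus the mean per-sample entropy, which grows
    when the individual samples disagree with each other.
    """
    n_samples = len(samples)
    n_classes = len(samples[0])
    mean = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    total = entropy(mean)
    expected = sum(entropy(s) for s in samples) / n_samples
    return {
        "mean": mean,
        "predictive_entropy": total,
        "mutual_information": total - expected,
    }


# Samples that agree give near-zero mutual information; samples that
# disagree give a large value even though each one is individually confident.
agree = summarize_samples([[0.9, 0.1, 0.0, 0.0]] * 50)
disagree = summarize_samples([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]] * 25)
```

Separating the two quantities helps distinguish images that are inherently ambiguous from images the model has simply not learned well.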

For our benchmark, we're using VGG16 with a standard classification head, which is a common baseline in medical imaging.
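
Returning to the interpretability component, the core Grad-CAM computation is small enough to sketch directly. In practice the activations and gradients would come from hooks on the last convolutional layer of the network. Here they are plain nested lists so the example runs without a deep learning framework, and the `grad_cam` function name is an assumption.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one layer's feature maps.

    activations[k][i][j]: value of feature map k at spatial position (i, j).
    gradients[k][i][j]: gradient of the target class score with respect to
    that activation.
    Returns an H x W heatmap scaled to [0, 1].
    """
    n_maps = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average of the gradients over each map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(n_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Normalize by the peak value so heatmaps are comparable across images.
    peak = max(max(row) for row in cam)
    if peak > 0.0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Two 2x2 feature maps: the first supports the class, the second opposes it.
activations = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

The resulting low-resolution map would be upsampled to the input size and overlaid on the blood smear image for review.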


# Evaluation Metrics

We'll evaluate our system using three categories of metrics:

1. Classification Performance:
   - Multi-class accuracy
   - Per-class precision, recall, F1-scores
   - Confusion matrix analysis
   - ROC curves and AUC for each class

2. Uncertainty Quality:
   - Expected calibration error
   - Prediction interval coverage probability
   - Uncertainty-error correlation

3. Interpretability Measures:
   - Localization accuracy of highlighted regions
   - Expert evaluation of visual explanations
   - Correlation with known diagnostic features
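
Two of the metrics above can be computed from simple tallies: per-class precision, recall, and F1 from a confusion matrix, and expected calibration error (ECE) from prediction confidences. The sketch below is a dependency-free illustration. The function names, the choice of 10 equal-width bins, and the toy numbers are assumptions, not results.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of class-i images predicted as class j."""
    n_classes = len(confusion)
    metrics = []
    for c in range(n_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |accuracy - confidence| over equal-width bins.

    confidences: top-class probability for each prediction.
    correct: 1 if that prediction was right, else 0.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(conf for conf, _ in members) / len(members)
            accuracy = sum(hit for _, hit in members) / len(members)
            ece += len(members) / len(confidences) * abs(accuracy - avg_conf)
    return ece


# Toy two-class confusion matrix, for illustration only.
toy_metrics = per_class_metrics([[8, 2], [1, 9]])
# Predictions made at 90% confidence that are right 9 times out of 10
# are well calibrated, so their ECE is close to zero.
toy_ece = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
```

Reporting ECE alongside accuracy matters here because a model can be accurate overall while still being overconfident on the cases it gets wrong.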


# Ethics & Privacy

We've identified several critical ethical considerations:

1. Patient Privacy
   - We'll ensure all images remain anonymized and securely handled
   - Any deployment of our system will comply with HIPAA regulations

2. Clinical Impact
   - We're carefully considering false negative risks that could delay treatment
   - We're addressing false positive impacts that could lead to unnecessary procedures
   - Our system will clearly communicate uncertainty levels

3. Bias Mitigation
   - We'll analyze the demographic representation of the dataset
   - We'll run regular performance audits across patient subgroups
   - We'll clearly document system limitations


# Team Expectations 

* We'll maintain daily communication through Discord/Slack
* All code changes require review before merging to main branch
* We commit to 24-hour maximum response times
* Work will be distributed equally with clear ownership
* We'll hold weekly sync meetings to track progress

# Project Timeline Proposal

| Meeting Date | Meeting Time | Completed Before Meeting | Discuss at Meeting |
|-------------|-------------|---------------------------|---------------------|
| **2/14/25** | 6 PM        | Initial research and dataset exploration (**All Members**) | Finalize architecture, discuss preprocessing |
| **2/21/25** | 6 PM        | Basic model implementation (**Zain & Syed**) | Review first results, plan improvements |
| **2/28/25** | 6 PM        | Uncertainty quantification (**Mohsin & Wilson**) | Evaluate metrics, optimization strategy |
| **3/7/25**  | 6 PM        | Interpretability features (**All Members**) | Final review, prepare documentation |
| **3/14/25** | 6 PM        | Documentation and testing (**All Members**) | Final presentation prep |
| **3/19/25** | Before 11:59 PM | NA | Submit Final Project |

# Footnotes
<a name="terwilligernote"></a>1.[^](#terwilliger): Terwilliger, T., & Abdul-Hay, M. (2017). Acute lymphoblastic leukemia: a comprehensive review and 2017 update. Blood Cancer Journal, 7(6), e577.<br>
<a name="labatinote"></a>2.[^](#labati): Labati, R. D., et al. (2011). ALL-IDB: The acute lymphoblastic leukemia image database for image processing. IEEE International Conference on Image Processing.<br>
<a name="rehmannote"></a>3.[^](#rehman): Rehman, A., et al. (2020). Classification of acute lymphoblastic leukemia using deep learning. Microscopy Research and Technique, 83(11), 1365-1378.<br>
<a name="tonekabininote"></a>4.[^](#tonekaboni): Tonekaboni, S., et al. (2019). What Clinicians Want: Contextualizing Explainable Machine Learning for Clinical End Use. Machine Learning for Healthcare Conference.<br>
<a name="begolinote"></a>5.[^](#begoli): Begoli, E., et al. (2019). The need for uncertainty quantification in machine-assisted medical decision making. Nature Machine Intelligence, 1(1), 20-23.<br>
