<a href="https://colab.research.google.com/github/gamesMum/Leukemia-Diagnostics/blob/master/Leukemia_Diagnosis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Leukemia Diagnosis**

**Classification of Acute Leukemia using Pretrained Deep Convolutional Neural Networks**
Based on the implementation in the paper:

[**Human-level recognition of blast cells in acute myeloid
leukemia with convolutional neural networks**](https://www.biorxiv.org/content/10.1101/564039v1.full.pdf)

**The dataset used in this implementation:**

- The dataset covers AML (acute myeloid leukemia) rather than ALL (acute lymphoblastic leukemia).
- There are 16 subtypes (classes), including the Normal class.
- Link to the dataset: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=61080958




# **Materials and Methods**

- Peripheral blood smears were selected from 100 patients diagnosed with different subtypes of AML at the Laboratory of Leukemia Diagnostics at Munich University Hospital between 2014 and 2017, along with smears from 100 patients found to exhibit no morphological features of hematological malignancies in the same time frame.
- The resulting digitised data consisted of multiresolution pyramidal images of approximately 1 GB per scanned area of interest.
- A trained examiner experienced in routine cytomorphological diagnostics at Munich University Hospital differentiated the physiological and pathological leukocyte types contained in the microscopic scans according to the classification scheme (see fig. 2B of the paper). The scheme is derived from standard morphological categories and was refined to take into account subcategories relevant for the morphological classification of AML, such as bilobed promyelocytes, which are typical of the FAB subtype M3v.
- Annotation was carried out on a single-cell basis, and approximately 100 cells were differentiated in each smear.
- Subimage patches of size 400 x 400 pixels (corresponding to approximately 29µm x 29µm) around the annotated cells were extracted without further cropping or filtering, including background components such as erythrocytes, platelets or cell fragments.
- When examining the screened blood smears, the cytologist followed the routine clinical procedure. Overall, 18,365 single-cell images were annotated and cut out of the scan regions.
- Annotations of single-cell images provide the ground truth for training and evaluation of the network.
- Morphological classes containing fewer than 10 images were merged with neighbouring classes of the taxonomy.
- A subset of 1,905 single-cell images from all morphological categories was presented to a second, independent examiner and annotated a second time in order to estimate inter-rater variability.
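
The patch geometry described above (400 x 400 pixels covering roughly 29 µm x 29 µm) implies a pixel size of about 0.0725 µm. The following is a minimal sketch of that arithmetic, together with a hypothetical `crop_box` helper that computes a square patch around an annotated cell centre. The helper name and the example coordinates are assumptions for illustration and are not taken from the paper or the dataset tooling.

```python
# Patch geometry as stated in the text: 400 x 400 px covering about 29 µm x 29 µm.
PATCH_PX = 400
PATCH_UM = 29.0


def microns_per_pixel(patch_px=PATCH_PX, patch_um=PATCH_UM):
    """Approximate physical edge length of one pixel, in micrometres."""
    return patch_um / patch_px


def crop_box(cx, cy, patch_px=PATCH_PX):
    """Return (left, top, right, bottom) of a square patch centred on (cx, cy).

    Hypothetical helper: the box is not clipped to the borders of the scan.
    """
    half = patch_px // 2
    return (cx - half, cy - half, cx + half, cy + half)


print(round(microns_per_pixel(), 4))  # 0.0725
print(crop_box(1000, 750))            # (800, 550, 1200, 950)
```

The 4-tuple follows the (left, upper, right, lower) convention that `PIL.Image.crop` expects, so the same box could be used to cut a patch out of a loaded scan region.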

**For classification:**
- A ResNeXt CNN was used.






In [0]:
import torch
from torch import nn
from torchvision import datasets, transforms, models

In [0]:
#list of all models in torchvision
dir(models)

In [4]:
#Check if CUDA is available
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
  print('CUDA is not available. Training on CPU...')
else:
  print('CUDA is available! training on GPU...')

CUDA is available! training on GPU...


In [0]:
#prepare the data
from torch.utils.data.sampler import SubsetRandomSampler
import torchvision as tv
batch_size = 32
valid_size = 0.2

# mean and std values are specified in https://pytorch.org/hub/pytorch_vision_resnet/
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

#define the transforms
train_transform  = transforms.Compose([transforms.Resize((400,400)),
                                       transforms.RandomRotation(359),
                                       transforms.RandomHorizontalFlip(0.2),
                                       transforms.ToTensor(),
                                       transforms.Normalize(mean=mean, std=std)])

test_transforms = transforms.Compose([transforms.Resize((400, 400)),
                                      transforms.ToTensor(),
                                      transforms.Normalize(mean=mean, std=std)])






classes = ['BAS', 'EBO', 'EOS', 'KSC', 'LYA', 'LYT', 'MMZ', 'MOB', 'MON', 'MYB', 'MYO',
           'NGB', 'NGS', 'PMB', 'PMO', 'NORM']
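
The cell above sets `valid_size = 0.2` and imports `SubsetRandomSampler`, but the code that actually splits the data is not shown. One common way to do it, sketched here with the standard library only, is to shuffle the sample indices once and cut them into a training and a validation list. The function name `split_indices` and the fixed seed are assumptions made for this illustration.

```python
import random


def split_indices(num_samples, valid_size=0.2, seed=0):
    """Shuffle the indices 0..num_samples-1 and split off a validation share.

    Returns (train_indices, valid_indices). Hypothetical helper, not part of
    the original notebook.
    """
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_valid = int(valid_size * num_samples)
    return indices[n_valid:], indices[:n_valid]


# 18,365 is the number of annotated single-cell images reported above.
train_idx, valid_idx = split_indices(18365, valid_size=0.2)
print(len(train_idx), len(valid_idx))  # 14692 3673
```

Each index list can then be wrapped in `SubsetRandomSampler(...)` and passed to a `DataLoader` through its `sampler` argument, so that the training and validation loaders draw from disjoint parts of the same dataset.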

In [11]:
#Load the pretrained ResNeXt-101 (32x8d) model
resNetModel = models.resnext101_32x8d(pretrained=True)
resNetModel

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1

In [0]:
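#freeze the pretrained feature extractor so only the new classifier head is trained
for param in resNetModel.parameters():
  param.requires_grad = False

from collections import OrderedDict

#torchvision ResNet/ResNeXt models expose their final layer as `fc`
num_features = resNetModel.fc.in_features

#layer names must be unique, otherwise later entries overwrite earlier ones
classifier = nn.Sequential(OrderedDict([
                          ('fc1', nn.Linear(num_features, 1000)),
                          ('relu1', nn.ReLU()),
                          ('dropout1', nn.Dropout(0.5)),
                          ('fc2', nn.Linear(1000, 512)),
                          ('relu2', nn.ReLU()),
                          ('dropout2', nn.Dropout(0.5)),
                          #one output per entry in `classes` (16, including NORM)
                          ('fc3', nn.Linear(512, len(classes))),
                          ('output', nn.LogSoftmax(dim=1))
                                      ]))

#replace the 1000-way ImageNet layer; a `.classifier` attribute would never be called in forward()
resNetModel.fc = classifier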

In [18]:
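from torch import optim
#Loss function and optimization function
criterion = nn.NLLLoss()
#only pass parameters that still require gradients (the frozen layers are skipped)
optimizer = optim.SGD(filter(lambda p: p.requires_grad, resNetModel.parameters()), lr=0.01)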
if train_on_gpu:
  resNetModel.cuda()

resNetModel

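
The classifier above ends in `LogSoftmax` and is trained with `nn.NLLLoss`; together the two compute the usual cross-entropy loss. The following single-sample sketch works through that arithmetic in plain Python. It is an illustration of the formula only and does not call PyTorch; the example logits are made up.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of raw scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class for one sample."""
    return -log_probs[target]


logits = [2.0, 0.5, -1.0]          # made-up scores for three classes
log_probs = log_softmax(logits)
loss = nll_loss(log_probs, target=0)

# The exponentiated log-probabilities form a probability distribution.
probs = [math.exp(lp) for lp in log_probs]
print(round(sum(probs), 6))
print(loss)
```

The loss is small when the target class receives most of the probability mass and grows as the model assigns it less, which is why the pairing needs log-probabilities rather than raw scores as input to `NLLLoss`.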

In [0]:
 #training the model
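
The training cell is empty in the notebook. As a rough sketch of what a loop matching the `NLLLoss` and SGD setup above might look like, the example below trains a small stand-in model on random tensors. The stand-in model, the synthetic data, and the `train_one_epoch` helper are all assumptions made so the example runs without the dataset; it is not the author's training code.

```python
import math

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-in data (hypothetical): 64 random feature vectors with 16 possible labels.
features = torch.randn(64, 32)
labels = torch.randint(0, 16, (64,))
loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)

# Stand-in model ending in LogSoftmax, so that NLLLoss applies as in the notebook.
model = nn.Sequential(nn.Linear(32, 16), nn.LogSoftmax(dim=1))
criterion = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)


def train_one_epoch(model, loader, criterion, optimizer, device="cpu"):
    """Run one pass over the loader and return the mean loss per sample."""
    model.train()
    running_loss = 0.0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()                 # clear gradients from the last step
        loss = criterion(model(inputs), targets)
        loss.backward()                       # backpropagate
        optimizer.step()                      # update trainable parameters
        running_loss += loss.item() * inputs.size(0)
    return running_loss / len(loader.dataset)


losses = [train_one_epoch(model, loader, criterion, optimizer) for _ in range(3)]
print(losses)
```

With the real data, `model` would be `resNetModel`, `loader` a `DataLoader` over the training images, and `device` would be `"cuda"` when `train_on_gpu` is true.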
 