<a href="https://colab.research.google.com/github/petre001/PET_Biomarkers/blob/main/notebooks/setup.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Predicting PET Biomarkers of Alzheimer’s Disease With MRI Using Deep Convolutional Neural Networks 
### Contributors: Jeffrey Petrella

This project uses transfer learning to train a ResNet18 model to identify amyloid PET biomarker status from MRI images. The notebook below instead trains MONAI's 3D DenseNet121 on the MRI volumes, and it should be run on a GPU.

### Step 1: Link Notebook to GitHub

In [1]:
# Remove Colab default sample_data
!rm -r ./sample_data

# Clone GitHub files to colab workspace
repo_name = "PET_Biomarkers" # Enter repo name
git_path = 'https://github.com/petre001/PET_Biomarkers.git'
!git clone "{git_path}"

Cloning into 'PET_Biomarkers'...
remote: Enumerating objects: 160, done.
remote: Counting objects: 100% (157/157), done.
remote: Compressing objects: 100% (90/90), done.
remote: Total 160 (delta 77), reused 130 (delta 61), pack-reused 3
Receiving objects: 100% (160/160), 1.36 MiB | 12.64 MiB/s, done.
Resolving deltas: 100% (77/77), done.


### Step 2: Install and Import Dependencies

In [3]:
# Change working directory to Git Repo
%cd "{repo_name}"

[Errno 2] No such file or directory: 'PET_Biomarkers'
/content/PET_Biomarkers


In [4]:
# Install dependencies from requirements.txt file
!pip install -r requirements.txt

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting monai
  Downloading monai-0.9.1-202207251608-py3-none-any.whl (990 kB)
Collecting pydicom
  Downloading pydicom-2.3.0-py3-none-any.whl (2.0 MB)
Collecting streamlit
  Downloading streamlit-1.11.1-py2.py3-none-any.whl (9.1 MB)
Collecting pydeck>=0.1.dev5
  Downloading pydeck-0.7.1-py2.py3-none-any.whl (4.3 MB)
Collecting blinker>=1.0.0
  Downloading blinker-1.5-py2.py3-none-any.whl (12 kB)
Collecting gitpython!=3.1.19
  Downloading GitPython-3.1.27-py3-none-any.whl (181 kB)
Collecting toml
  Downloading toml-0.10.2-py2.py3-none-any.whl (16 kB)
Collecting rich>=10.11.0
  Down

In [5]:
!git clone https://github.com/Project-MONAI/MONAI.git
%cd MONAI/
!pip install -e '.[all]'

Cloning into 'MONAI'...
remote: Enumerating objects: 26594, done.
remote: Counting objects: 100% (266/266), done.
remote: Compressing objects: 100% (191/191), done.
remote: Total 26594 (delta 125), reused 166 (delta 75), pack-reused 26328
Receiving objects: 100% (26594/26594), 49.31 MiB | 34.87 MiB/s, done.
Resolving deltas: 100% (20768/20768), done.
/content/PET_Biomarkers/MONAI
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Obtaining file:///content/PET_Biomarkers/MONAI
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Collecting itk>=5.2
  Downloading itk-5.2.1.post1-cp37-cp37m-manylinux2014_x86_64.whl (8.3 kB)
Collecting imagecodecs
  Downloading imagecodecs-2021.11.20-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.0 MB)
Collect

In [6]:
import os
import urllib.request
import tarfile
import zipfile
import copy
import time
import numpy as np
import pandas as pd
from torchsummary import summary
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

import pydicom
import cv2
from PIL import Image

import torch
from torchvision import datasets, transforms
import torchvision
from torch.utils.data import DataLoader, Dataset
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


In [7]:
import monai
from monai.data import DataLoader, ImageDataset, NumpyReader
from monai.transforms import AddChannel, Compose, RandRotate90, Resize, ScaleIntensity, EnsureType

pin_memory = torch.cuda.is_available()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)

torch:  1.12 ; cuda:  cu113
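
The version parsing in the cell above relies on PyTorch's `major.minor.patch+build` version string. As a minimal sketch, the hypothetical helper `parse_torch_version` below applies the same idea to hard-coded example strings (it does not read the installed `torch`). It also covers one caveat: when a version string has no `+` suffix, `split("+")[-1]` returns the whole version rather than a CUDA tag, so this sketch returns `None` in that case.

```python
def parse_torch_version(version: str):
    """Split a PyTorch-style version string into (major.minor, build tag).

    Hypothetical helper for illustration only.
    The build tag is None when the string has no '+' suffix.
    """
    base, _, build = version.partition("+")
    major_minor = ".".join(base.split(".")[:2])
    return major_minor, (build or None)


print(parse_torch_version("1.12.0+cu113"))  # ('1.12', 'cu113')
print(parse_torch_version("1.12.0"))        # ('1.12', None)
```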


### Step 3: Load Training and Test data

In [8]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [9]:
%cd /content/drive/MyDrive/test_mci_data

/content/drive/MyDrive/test_mci_data


In [10]:
# Create a list of images and labels
df = pd.read_csv('MCI_labels.csv')
images = df.iloc[:,0].to_list()
images = [i+'.npy' for i in images]
labels = df.iloc[:,10].to_list()
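
Selecting columns by position (`iloc[:,0]`, `iloc[:,10]`) silently picks the wrong data if the CSV columns are ever reordered. A minimal sketch of selecting by column name is shown below. The column names `image_id` and `amyloid_status` and the three rows are hypothetical, since the real header of `MCI_labels.csv` is not shown here, and the sketch uses the standard-library `csv` module instead of pandas so that it runs without extra dependencies.

```python
import csv
import io

# Hypothetical stand-in for MCI_labels.csv; the real column names are not shown above.
csv_text = """image_id,age,amyloid_status
sub001,71,0
sub002,68,1
sub003,75,1
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Build file names and integer labels from named columns
images = [row["image_id"] + ".npy" for row in rows]
labels = [int(row["amyloid_status"]) for row in rows]

# Fail early if the two lists ever get out of step
assert len(images) == len(labels)
print(images)
print(labels)
```

With pandas, the equivalent selection would be `df["image_id"]` and `df["amyloid_status"]`, again assuming those column names.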

In [11]:
# Define transforms
train_transforms = Compose([ScaleIntensity(), AddChannel(), Resize((96, 96, 96)), RandRotate90(), EnsureType()])
val_transforms = Compose([ScaleIntensity(), AddChannel(), Resize((96, 96, 96)), EnsureType()])

# Create training Dataset and DataLoader using the first 170 images (indices 0-169)
batch_size = 2

train_ds = ImageDataset(image_files=images[:170], labels=labels[:170], transform=train_transforms, reader='NumpyReader')
train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=torch.cuda.is_available())

# Create validation Dataset and DataLoader using the remaining 21 images (indices 171 onward; index 170 is in neither split)
val_ds = ImageDataset(image_files=images[171:], labels=labels[171:], transform=val_transforms, reader='NumpyReader')
val_loader = DataLoader(val_ds, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=torch.cuda.is_available())

# Set up dict for dataloaders
dataloaders = {'train':train_loader,'val':val_loader}

# Store size of training and validation sets
dataset_sizes = {'train':len(train_ds),'val':len(val_ds)}
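
The slices in the cell above use two different boundaries: `images[:170]` covers indices 0 to 169 and `images[171:]` starts at index 171, so the image at index 170 lands in neither split. A minimal sketch of a split that shares one boundary is shown below. The helper `split_at` is hypothetical, and the total of 192 stand-in file names is an assumption used only for illustration.

```python
def split_at(items, n_train):
    """Split a sequence into (train, val) using one shared boundary index."""
    return items[:n_train], items[n_train:]


# Stand-in file names; 192 is an assumed total used only for illustration.
items = [f"img_{i:03d}.npy" for i in range(192)]

train, val = split_at(items, 170)
print(len(train), len(val))  # 170 22

# The two-boundary form leaves out exactly one element.
skipped = set(items) - set(items[:170]) - set(items[171:])
print(sorted(skipped))  # ['img_170.npy']
```

For a class-balanced split, `train_test_split` from scikit-learn (already imported above) with its `stratify` argument is another option.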

In [12]:
im, label = monai.utils.misc.first(train_loader)
print(f'Image type: {type(im)}')
print(f'Input batch shape: {im.shape}')
print(f'Label batch shape: {label.shape}')

Image type: <class 'monai.data.meta_tensor.MetaTensor'>
Input batch shape: (2, 1, 96, 96, 96)
Label batch shape: torch.Size([2])
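
With its default arguments, the `ScaleIntensity()` transform used above rescales each image to the range [0, 1] by min-max normalization. The plain-Python function below is only an approximation of that arithmetic on a short list of made-up values. It is not MONAI's implementation, and its handling of a constant image (returning `minv` everywhere) is an assumption of this sketch.

```python
def min_max_scale(values, minv=0.0, maxv=1.0):
    """Linearly rescale values so the smallest maps to minv and the largest to maxv."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: no range to stretch (behavior chosen for this sketch)
        return [minv for _ in values]
    scale = (maxv - minv) / (hi - lo)
    return [minv + (v - lo) * scale for v in values]


# Made-up voxel intensities
print(min_max_scale([10.0, 20.0, 30.0]))  # [0.0, 0.5, 1.0]
```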


In [13]:
# Set up a mapping dictionary
classes = ['Amyloid(-)','Amyloid(+)']
idx_to_class = {i:j for i,j in enumerate(classes)}
class_to_idx = {v:k for k,v in idx_to_class.items()}
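
As a short usage sketch, the mapping can turn predicted class indices back into readable labels. The list `preds` below is made up for illustration; in the training loop it would correspond to the indices returned by `torch.max(outputs, 1)`.

```python
classes = ["Amyloid(-)", "Amyloid(+)"]
idx_to_class = dict(enumerate(classes))
class_to_idx = {name: idx for idx, name in idx_to_class.items()}

# Made-up predicted class indices
preds = [1, 0, 1]
pred_names = [idx_to_class[p] for p in preds]
print(pred_names)  # ['Amyloid(+)', 'Amyloid(-)', 'Amyloid(+)']
```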

### Step 4: Define our model architecture
We will use MONAI's 3D DenseNet121 for this task. The constructor call below does not request pre-trained weights, so the network is trained from scratch.

In [14]:
# Build a DenseNet121 (no pre-trained weights are requested here)
# We have a single input channel and 2 output classes
# We set spatial_dims=3 to indicate we want to use the version suitable for 3D input images
model = monai.networks.nets.DenseNet121(spatial_dims=3, in_channels=1, out_channels=2).to(device)

### Step 5: Train the Model

In [15]:
def train_model(model, criterion, optimizer, dataloaders, device, num_epochs=5):

    model = model.to(device) # Send model to GPU if available

    iter_num = {'train':0,'val':0} # Track total number of iterations

    best_metric = -1

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Get the input images and labels, and send to GPU if available
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Zero the weight gradients
                optimizer.zero_grad()

                # Forward pass to get outputs and calculate loss
                # Track gradient only for training data
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # Backpropagation to get the gradients with respect to each weight
                    # Only if in train
                    if phase == 'train':
                        loss.backward()
                        # Update the weights
                        optimizer.step()

                # Convert loss into a scalar and add it to running_loss
                running_loss += loss.item() * inputs.size(0)
                # Track number of correct predictions
                running_corrects += torch.sum(preds == labels.data)

                # Increment the iteration count
                iter_num[phase] += 1

            # Calculate and display average loss and accuracy for the epoch
            # (dataset size is read from the loader rather than a global variable)
            n_samples = len(dataloaders[phase].dataset)
            epoch_loss = running_loss / n_samples
            epoch_acc = running_corrects.double() / n_samples
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # Save weights if accuracy is best
            if phase=='val':
                if epoch_acc > best_metric:
                    best_metric = epoch_acc
                    os.makedirs('./models', exist_ok=True)
                    torch.save(model.state_dict(),'models/3d_classification_model.pth')
                    print('Saved best new model')

    print(f'Training complete. Best validation set accuracy was {best_metric}')
    
    return
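
The loop above multiplies each batch loss by `inputs.size(0)` because `CrossEntropyLoss` returns the mean over a batch by default. Weighting by batch size gives the per-sample mean for the whole epoch, which matters when the last batch is smaller (for example, 21 validation images with a batch size of 2). A minimal sketch with made-up batch losses:

```python
# Made-up (mean batch loss, batch size) pairs; the last batch is smaller.
batches = [(0.70, 2), (0.60, 2), (0.90, 1)]

# Same accumulation as running_loss in the training loop
running_loss = sum(loss * n for loss, n in batches)
n_samples = sum(n for _, n in batches)

weighted_mean = running_loss / n_samples                      # per-sample mean
naive_mean = sum(loss for loss, _ in batches) / len(batches)  # over-weights the small batch

print(round(weighted_mean, 4), round(naive_mean, 4))  # 0.7 0.7333
```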

In [16]:
# Use cross-entropy loss function
criterion = torch.nn.CrossEntropyLoss()
# torch.nn.BCEWithLogitsLoss() is an alternative, but it expects float targets with the same shape as the outputs (e.g. one-hot labels)

# Use Adam adaptive optimizer
optimizer = torch.optim.Adam(model.parameters(), 1e-4)

# Train the model
epochs=5
train_model(model, criterion, optimizer, dataloaders, device, num_epochs=epochs)

Epoch 0/4
----------
train Loss: 0.6876 Acc: 0.5941
val Loss: 0.6374 Acc: 0.6667
Saved best new model
Epoch 1/4
----------
train Loss: 0.6656 Acc: 0.6412
val Loss: 0.6545 Acc: 0.5714
Epoch 2/4
----------
train Loss: 0.6802 Acc: 0.6059
val Loss: 0.6031 Acc: 0.6667
Epoch 3/4
----------
train Loss: 0.6813 Acc: 0.6412
val Loss: 0.6219 Acc: 0.6667
Epoch 4/4
----------
train Loss: 0.6517 Acc: 0.6471
val Loss: 0.5567 Acc: 0.6667
Training complete. Best validation set accuracy was 0.6666666666666666
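
Two quick checks help when reading the log above. For a two-class problem, a model that assigns probability 0.5 to each class has a cross-entropy of ln 2, so losses close to that value suggest predictions near chance. The printed accuracies can also be converted back into counts. The sketch below assumes a validation set of 21 images, as stated in the data-loading cell.

```python
import math

# Cross-entropy of a 50/50 prediction on a two-class problem
chance_loss = math.log(2)
print(round(chance_loss, 4))  # 0.6931

# Convert the printed validation accuracies back into counts
n_val = 21  # assumed validation-set size
for acc in (0.6667, 0.5714):
    n_correct = round(acc * n_val)
    print(acc, "->", n_correct, "of", n_val, "correct")
```

On this reading, the validation accuracy moves between 12 and 14 correct predictions out of 21, a difference small enough that comparisons between epochs should be treated with caution.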
