# CNN using TensorFlow Keras on MRI Image Data - failed attempt (out of memory)

## Data Use Agreements
The data used for this project were provided in part by OASIS and ADNI.

OASIS-3: Principal Investigators: T. Benzinger, D. Marcus, J. Morris; NIH P50 AG00561, P30 NS09857781, P01 AG026276, P01 AG003991, R01 AG043434, UL1 TR000448, R01 EB009352. AV-45 doses were provided by Avid Radiopharmaceuticals, a wholly owned subsidiary of Eli Lilly.

Data collection for this project was done through the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

## Setup

### General Imports

In [4]:
import nibabel.freesurfer.mghformat as mgh

from tqdm.notebook import tqdm

import os, sys, shutil

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
import random
pd.set_option('display.max_colwidth', None)

### Set up and test TensorFlow

In [5]:
import tensorflow as tf

In [6]:
# # Hide the GPU so that TensorFlow falls back to the CPU, because the GPU runs out of memory (OOM)
# os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

# if tf.test.gpu_device_name():
#     print('GPU found')
# else:
#     print("No GPU found")

In [7]:
print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tf.keras.__version__}")
print()
print(f"Python {sys.version}")
print(f"Pandas {pd.__version__}")
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Tensor Flow Version: 2.5.0
Keras Version: 2.5.0

Python 3.8.10 (default, Jun  4 2021, 15:09:15) 
[GCC 7.5.0]
Pandas 1.2.4
Num GPUs Available:  0


In [5]:
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
sess.as_default()

<contextlib._GeneratorContextManager at 0x7fb6c52de5b0>

In [6]:
# tf.debugging.set_log_device_placement(True)

# # Create some tensors
# a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
# b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# c = tf.matmul(a, b)

# print(c)

## Load data

### MRI
All of the files have already been transformed with FreeSurfer, so major preprocessing such as cropping, flipping, or rotating should not be necessary. The data directory is laid out as follows:
```
main_directory/
    control/
        mr_id_001/
            brain_image_001.mgz
            brain_image_001_transformed.mgz
            talairach_001.xfm
        mr_id_002/
            brain_image_002.mgz
            brain_image_002_transformed.mgz
            talairach_002.xfm
    dementia/
        mr_id_003/
            brain_image_003.mgz
            brain_image_003_transformed.mgz
            talairach_003.xfm
        mr_id_004/
            brain_image_004.mgz
            brain_image_004_transformed.mgz
            talairach_004.xfm
```

#### Define some flags for general use

In [7]:
# This class allows you to access dictionary items with a dot
# Gathered from here: https://stackoverflow.com/questions/2352181/how-to-use-a-dot-to-access-members-of-dictionary
class dotdict(dict):
    """dot.notation access to dictionary attributes"""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
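
One caveat of this recipe: because `__getattr__` is bound to `dict.get`, a misspelled key returns `None` instead of raising an error. A minimal sketch of that behaviour (the key names here are only illustrative):

```python
class dotdict(dict):
    """dot.notation access to dictionary attributes"""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

flags = dotdict({'scan_width': 256})
flags.scan_depth = 256        # attribute assignment stores a regular dict key

width = flags.scan_width      # existing key: behaves like flags['scan_width']
typo = flags.scan_widht       # misspelled key: dict.get returns None, no exception
print(width, typo, flags['scan_depth'])
```

If a typo in a flag name should fail loudly, plain `FLAGS['name']` indexing still raises `KeyError`.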

In [8]:
FLAGS = {
    'scan_width'  : 256,
    'scan_height' : 256,
    'scan_depth'  : 256,
    'data_dir'    : '/home/jack/Code/GitHub/Polygence/Data/OASIS/mri_data',
    'num_class'   : 4
}
FLAGS = dotdict(FLAGS)
print(FLAGS)

{'scan_width': 256, 'scan_height': 256, 'scan_depth': 256, 'data_dir': '/home/jack/Code/GitHub/Polygence/Data/OASIS/mri_data', 'num_class': 4}


#### Generate the filenames
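The scans are first collected as a shuffled list of `[path, label]` pairs, organized as shown below, and then split into a `filenames` array and an `(m, 1)` `labels` array: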
```
[[path_to_scan_1, label_for_scan_1],
 [path_to_scan_2, label_for_scan_2],
 [path_to_scan_3, label_for_scan_3],
 ...
 [path_to_scan_n, label_for_scan_n]]

```

In [9]:
def generate_filenames(labels=('control', 'dementia'), random_state=1337):
    pairs = []
    
    for label in labels:
        label_dir = os.path.join(FLAGS.data_dir, label)
        mr_ids = os.listdir(label_dir)
        mr_ids.sort()
        
        for mr_id in tqdm(mr_ids, desc=label):
            scans = os.path.join(label_dir, mr_id)
            img_file = [file for file in os.listdir(scans) if "transformed" in file]
            img_path = os.path.join(scans, img_file[0])
            i = 1 if label == 'dementia' else 0
            pairs.append([img_path, i])
            
    random.Random(random_state).shuffle(pairs)
    
    # Split the shuffled pairs into a filename array and an (m, 1) int32 label column
    filenames = np.array([path for path, _ in pairs])
    labels = np.array([label for _, label in pairs], dtype='int32').reshape(-1, 1)
    return filenames, labels
        

X_filenames, y_labels = generate_filenames(random_state=1337)

control:   0%|          | 0/712 [00:00<?, ?it/s]

dementia:   0%|          | 0/310 [00:00<?, ?it/s]

In [10]:
# Save these so they can be reloaded later if needed
np.save('npy_files/X_filenames.npy', X_filenames)
np.save('npy_files/y_labels.npy', y_labels)

### Split into training and testing set

In [11]:
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split

In [12]:
X_filenames_shuffled, y_labels_shuffled = shuffle(X_filenames, y_labels, random_state=1337)

In [13]:
X_train_filenames, X_test_filenames, y_train, y_test = train_test_split(
    X_filenames_shuffled, y_labels_shuffled, test_size=.2, random_state=1337)

In [14]:
print(X_train_filenames.shape, y_train.shape)
print(">> train >> ", X_train_filenames[0], y_train[0])
print(X_test_filenames.shape, y_test.shape)
print(">> test >> ", X_test_filenames[5], y_test[5])

(817,) (817, 1)
>> train >>  /home/jack/Code/GitHub/Polygence/Data/OASIS/mri_data/control/OAS30288_MR_d0897/OAS30288_Freesurfer50_d0897_brain_transformed.mgz [0]
(205,) (205, 1)
>> test >>  /home/jack/Code/GitHub/Polygence/Data/OASIS/mri_data/dementia/OAS30479_MR_d1266/OAS30479_Freesurfer53_d1266_brain_transformed.mgz [1]
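
The split sizes above can be checked by hand. This is a small sketch that assumes scikit-learn rounds a fractional `test_size` up and gives the remainder to the training set; it also computes the share of the majority class, which is a useful reference point when reading the accuracy metric later:

```python
import math

n_control, n_dementia = 712, 310       # counts from the progress bars above
n_samples = n_control + n_dementia
test_size = 0.2

# Assumption: fractional test sizes are rounded up, the rest goes to train
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# Accuracy of always predicting "control" on the full dataset
majority_share = n_control / n_samples

print(n_train, n_test, round(majority_share, 3))
```

The computed sizes agree with the `(817,)` and `(205,)` shapes printed above.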


In [15]:
# Save these split arrays for later as well
np.save('npy_files/X_train_filenames.npy', X_train_filenames)
np.save('npy_files/y_train.npy', y_train)

np.save('npy_files/X_val_filenames.npy', X_test_filenames)
np.save('npy_files/y_val.npy', y_test)

### Create a custom generator
The full dataset is too large to fit into memory at once, so we read the scans from disk in batches.
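
To put a number on that, here is a back-of-the-envelope estimate. It assumes every scan is loaded at full 256x256x256 resolution as float64 (the default dtype of nibabel's `get_fdata()`):

```python
# Rough memory needed to hold every scan in RAM at once
n_scans = 712 + 310                 # control + dementia scans found above
voxels_per_scan = 256 ** 3          # scan_width * scan_height * scan_depth
bytes_per_voxel = 8                 # assumption: float64 from get_fdata()

bytes_per_scan = voxels_per_scan * bytes_per_voxel
total_gib = n_scans * bytes_per_scan / 2 ** 30

print(f"{bytes_per_scan / 2**20:.0f} MiB per scan, {total_gib:.2f} GiB in total")
```

At roughly 128 MiB per scan, the whole dataset would need well over 100 GiB, which is why a batch generator is used instead.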

In [16]:
from tensorflow import keras

In [17]:
from skimage.transform import resize, rescale

In [18]:
class MRI_Data_Generator(keras.utils.Sequence):
    """
    A data generator that reads MRI data in batches, and returns their image data
    """
    
    def __init__(self, filenames, labels, batch_size):
        """
        Initializes the generator
        :param filenames: list containing the path to each MRI scan file, should be np.ndarray
        :param labels: labels associated with the scans in filenames (control, dementia), should be np.ndarray
        :param batch_size: the size of the batch
        """
        self.filenames = filenames
        self.labels = labels
        self.batch_size = batch_size
        
    def __len__(self):
        """
        Calculate the number of batches that we are supposed to produce.
        Returns a rounded-up integer of total number of filenames divided by batch size.
        """
        return int(np.ceil(len(self.filenames) / float(self.batch_size)))
        
    def __getitem__(self, idx):
        """
        Scan the data within that batch
        :param idx: the index of the batch to be selected
        """
        # Select the filenames and labels for this batch index.
        # Slicing clamps to the array length, so the final (possibly shorter) batch is safe.
        batch_X = self.filenames[idx*self.batch_size : (idx+1)*self.batch_size]
        batch_y = self.labels[idx*self.batch_size : (idx+1)*self.batch_size]
        
        # Data preprocessing
        def normalize(volume):
            """ Clip the volume to [0, 255] and scale it to [0, 1] as float32 """
            lo, hi = 0.0, 255.0  # named lo/hi to avoid shadowing the built-in min/max
            volume = np.clip(volume, lo, hi)
            volume = (volume - lo) / (hi - lo)
            return volume.astype("float32")

        def scale(volume):
            """ Resize the volume from (256,256,256) to (128,128,128,1), adding a channel axis """
            return resize(volume, (128,128,128,1))
            # return rescale(volume, 0.5)
        
        # print(f'Currently reading in batch {idx}')
        
        batch_X_data = []
        for filename in batch_X:
            rem_fs = filename[:filename.rfind("/")]
            rem_fs = rem_fs[rem_fs[:rem_fs.rfind("/")].rfind("/")+1:]
#             print(f'Currently reading in batch {idx}: {rem_fs}')
            
            MRI_orig = mgh.load(filename)
            volume = MRI_orig.get_fdata()
            volume = normalize(volume)
            volume = scale(volume)
            batch_X_data.append(volume)
        
        np_res_X = np.array(batch_X_data)
        np_res_y = np.array(batch_y)
        # print(f'Shapes: x - {np_res_X.shape}, y - {np_res_y.shape}')
        return np_res_X, np_res_y
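
The batch arithmetic in `__len__` and `__getitem__` can be checked without loading any scans. The sketch below uses a hypothetical helper, `batch_slices`, that mirrors the index math of the generator for the 817 training and 205 testing files:

```python
import math

def batch_slices(n_items, batch_size):
    """Yield the (start, stop) index pairs that __getitem__ would slice for each batch."""
    n_batches = math.ceil(n_items / batch_size)          # same rounding as __len__
    for idx in range(n_batches):
        start = idx * batch_size
        stop = min((idx + 1) * batch_size, n_items)      # array slicing clamps the same way
        yield start, stop

train = list(batch_slices(817, 8))
test = list(batch_slices(205, 8))
print(len(train), train[-1])   # number of training batches and the last slice
print(len(test), test[-1])     # number of testing batches and the last slice
```

The last training batch holds a single scan and the last testing batch holds five. Note that `817 // 8` is 102, so passing a floor-divided `steps_per_epoch` to `fit` would leave one of the 103 batches unused in each epoch.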

### Create and test our data generators

In [19]:
batch_size = 8
# batch_size = 2

training_batch_generator = MRI_Data_Generator(X_train_filenames, y_train, batch_size)
testing_batch_generator = MRI_Data_Generator(X_test_filenames, y_test, batch_size)

In [20]:
# batchx, batchy = training_batch_generator.__getitem__(0)
# print(">> Num of batches >> ", training_batch_generator.__len__())  # outputs 103, which is equal to ceil(817 / 8)

In [21]:
# print(batchx.shape, batchy.shape)
# print(batchy)

In [22]:
# batchx, batchy = testing_batch_generator.__getitem__(0)
# print(">> Num of batches >> ", testing_batch_generator.__len__()) # outputs 26, which is equal to ceil(205 / 8)

In [23]:
# print(batchx.shape, batchy.shape)
# print(batchy)

In [24]:
# delete them to save memory
# del batchx, batchy

## CNN Model
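The model is a single 3D convolutional block (Conv3D, max pooling, batch normalization, dropout) followed by global average pooling, a 256-unit dense layer with dropout, and a sigmoid output layer for binary classification.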

Resources used:
* https://towardsdatascience.com/step-by-step-implementation-3d-convolutional-neural-network-in-keras-12efbdd7b130
* https://medium.com/@mrgarg.rajat/training-on-large-datasets-that-dont-fit-in-memory-in-keras-60a974785d71

### Create the model

In [25]:
from tensorflow.keras import layers

In [26]:
# def create_model():
#     model = keras.Sequential()

#     # The input is 256x256x256x1: 256x256x256 for each scan, and 1 for the channel

#     # The first layer is a Conv3D layer that learns 64 filters with a 5x5x5 kernel
#     model.add(layers.Conv3D(filters=64, kernel_size=5, activation='relu', input_shape=(256,256,256,1)))

#     # The second layer is a max-pooling layer with a 2x2x2 window for down sampling
#     # I'm a little unsure of what max-pooling does exactly, but every resource I've seen used it after the Conv layer
#     model.add(layers.MaxPool3D(pool_size=2))

#     # Normalizes the batch
#     model.add(layers.BatchNormalization())

#     # Dropout layer to prevent overfitting
#     # I'm also a little unsure with this, but again, every resource has used it
#     model.add(layers.Dropout(0.25))

#     # Layer to flatten the data
#     # I'd like your input on this. Some resources have used it, while others have not.
#     # This is commented out because it runs out of memory.
#     # model.add(layers.Flatten())

#     # Fully connected layers
#     # In general, is going from 512 -> 1 too much of a sharp drop?
#     model.add(layers.Dense(512, activation='relu'))
#     # model.add(layers.Dense(256, activation='relu'))

#     # Classification/output layer
#     model.add(layers.Dense(1, activation='sigmoid'))
    
#     return model

In [27]:
def create_model():
    inputs = keras.Input((128, 128, 128, 1))

    x = layers.Conv3D(filters=32, kernel_size=5, activation="relu")(inputs)
    x = layers.MaxPool3D(pool_size=2)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.25)(x)
    
    x = layers.GlobalAveragePooling3D()(x)
    x = layers.Dense(units=256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    
    outputs = layers.Dense(units=1, activation="sigmoid")(x)
    
    # Define the model
    model = keras.Model(inputs, outputs, name="3d-cnn")
    return model

In [28]:
model = create_model()
model.summary()

Model: "3d-cnn"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 128, 128, 128, 1) 0         
_________________________________________________________________
conv3d (Conv3D)              (None, 124, 124, 124, 32) 4032      
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 62, 62, 62, 32)    0         
_________________________________________________________________
batch_normalization (BatchNo (None, 62, 62, 62, 32)    128       
_________________________________________________________________
dropout (Dropout)            (None, 62, 62, 62, 32)    0         
_________________________________________________________________
global_average_pooling3d (Gl (None, 32)                0         
_________________________________________________________________
dense (Dense)                (None, 256)               8448 
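
The output shapes in the summary give a rough idea of how much memory the activations need. The sketch below only counts the output of the first `Conv3D` layer for one batch, assuming float32 activations and the batch size of 8 used above. Training also keeps the pooled, normalized, and dropout outputs plus gradients, so treat this as a lower bound rather than a full estimate:

```python
# Size of the conv3d output for one batch, from the shapes in model.summary()
batch_size = 8
conv_output_shape = (124, 124, 124, 32)   # three spatial dims and 32 filters
bytes_per_value = 4                       # assumption: float32 activations

values_per_sample = 1
for dim in conv_output_shape:
    values_per_sample *= dim

bytes_per_batch = batch_size * values_per_sample * bytes_per_value
print(f"{bytes_per_batch / 2**30:.2f} GiB for one batch of conv3d outputs")
```

A single layer output of nearly 2 GiB per batch is one plausible reason this attempt hit memory limits; a smaller batch size or a strided first convolution would shrink it.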

In [29]:
# model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])

In [30]:
# Compile model.
initial_learning_rate = 0.0001
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate, decay_steps=100000, decay_rate=0.96, staircase=True
)
model.compile(
    loss="binary_crossentropy",
    optimizer=keras.optimizers.Adam(learning_rate=lr_schedule),
    metrics=["acc"],
)
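
For intuition about this schedule, here is a plain-Python sketch of the formula documented for `ExponentialDecay`: the rate is `initial_learning_rate * decay_rate ** (step / decay_steps)`, with integer division when `staircase=True`. The helper name `exponential_decay` is only illustrative:

```python
def exponential_decay(step, initial_lr=1e-4, decay_steps=100_000,
                      decay_rate=0.96, staircase=True):
    """Learning rate at a given optimizer step, following the ExponentialDecay formula."""
    exponent = step / decay_steps
    if staircase:
        exponent = step // decay_steps   # integer division: the rate drops in discrete jumps
    return initial_lr * decay_rate ** exponent

steps_per_epoch = 817 // 8               # 102 steps, as passed to model.fit
last_step = steps_per_epoch * 10         # 10 epochs of training
lr_end = exponential_decay(last_step)
lr_first_drop = exponential_decay(100_000)
print(lr_end, lr_first_drop)
```

With only about a thousand optimizer steps in total, the first drop at step 100,000 is never reached, so the learning rate effectively stays at its initial value for this run.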

### Train the model

In [31]:
history = model.fit(
                training_batch_generator,
                steps_per_epoch = int(X_train_filenames.shape[0] // batch_size),  # samples = batch_size * steps
                epochs = 10,
                verbose = 1,
                validation_data = testing_batch_generator,
                validation_steps = int(X_test_filenames.shape[0] // batch_size))  # samples = batch_size * steps

Epoch 1/10


UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node 3d-cnn/conv3d/Conv3D (defined at <ipython-input-31-c7e657359a77>:1) ]] [Op:__inference_train_function_1226]

Function call stack:
train_function
