In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from processing_functions import create_numeric_labels, process_and_save
from classification_models import kNN_cross, RF_cross, MLP_cross
import warnings
warnings.simplefilter('ignore')

In this notebook I attempt to reproduce the dataset paper's results for the five benchmark models it tried: LightGBM (hyperparameters optimized with Hyperopt), Multilayer Perceptron (1 hidden layer of 3 neurons), Random Forests, Support Vector Machine (polynomial kernel) and k-Nearest Neighbors (k = 3). The paper considered both the AD-CN (Alzheimer's vs. Healthy) and the FTD-CN (Frontotemporal Dementia vs. Healthy) classification problems. The metrics it reported were accuracy, sensitivity, specificity, and F1 score, obtained by leave-one-subject-out cross-validation. This method leaves out one subject at a time, builds a model on the data from the remaining subjects, and computes the confusion matrix for that model's predictions on the left-out subject's data. The confusion matrices are then summed over all left-out subjects, and the metrics are computed from the resulting total confusion matrix.
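
The procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code inside `kNN_cross`, `RF_cross` or `MLP_cross`: the names `loso_confusion`, `metrics_from_confusion` and `nearest_centroid` are hypothetical, the classifier is a simple stand-in, and the data is a deterministic toy set.

```python
import numpy as np

def loso_confusion(X, y, groups, fit_predict):
    """Sum 2x2 confusion matrices over leave-one-subject-out folds.

    X: (n_samples, n_features), y: binary labels (1 = positive class),
    groups: subject id of each row,
    fit_predict: callable (X_train, y_train, X_test) -> predicted labels.
    """
    total = np.zeros((2, 2), dtype=int)  # rows: true label, columns: predicted
    for subject in np.unique(groups):
        test = groups == subject          # all epochs of the left-out subject
        y_pred = fit_predict(X[~test], y[~test], X[test])
        for t, p in zip(y[test], y_pred):
            total[int(t), int(p)] += 1
    return total

def metrics_from_confusion(cm):
    """Compute the four reported metrics from a pooled 2x2 confusion matrix."""
    tn, fp, fn, tp = cm[0, 0], cm[0, 1], cm[1, 0], cm[1, 1]
    return {
        "accuracy": (tp + tn) / cm.sum(),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

def nearest_centroid(X_train, y_train, X_test):
    # Hypothetical stand-in classifier (not the notebook's kNN):
    # assign each test row to the class with the closest training mean.
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Toy data: 6 subjects with 4 epochs each, two well-separated classes.
groups = np.repeat(np.arange(6), 4)
y = (groups >= 3).astype(int)
jitter = np.linspace(-1.0, 1.0, 24)[:, None] * np.ones((1, 2))
X = 10.0 * y[:, None] + jitter

cm = loso_confusion(X, y, groups, nearest_centroid)
metrics = metrics_from_confusion(cm)
print(cm)
print(metrics)
```

Because the metrics come from the pooled matrix rather than from an average of per-subject scores, subjects with more epochs contribute more to the final numbers.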

I will be using two different processing methods to obtain the relative band power and comparing the results. The first method is the one indicated in the dataset paper, which takes epoch_length = 2000 (4 seconds) and nperseg = 256 (the default value, giving a frequency resolution of about 1.95 Hz). The other method is one I found suggested in a sleep research blog post, and it also partially conforms with the method used in the CNN paper: take epoch_length = 15000 (30 seconds) and nperseg = 2000 (frequency resolution 0.25 Hz). I'll be referring to the first as "short epochs" and the second as "long epochs". The short epochs method has the advantage of producing a much larger dataset to train on, but its frequency resolution is far too coarse to accurately capture the lower cutoff of the Delta range or even the cutoff between the Alpha and Beta ranges. The long epochs method produces a much smaller dataset but allows precise integration over each of the five frequency bands.
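
To make the resolution argument concrete, here is a small sketch that counts how many spectral bins land in each band for the two `nperseg` choices. It assumes a sampling rate of 500 Hz (inferred from 2000 samples being 4 seconds) and uses the conventional band names for the edges 0.5, 4, 8, 13, 25, 45 Hz; `bins_per_band` is a hypothetical helper, and simple bin counting is only a proxy for how the actual band-power integration behaves.

```python
import numpy as np

fs = 500  # Hz; assumption inferred from epoch_length = 2000 being 4 seconds
band_edges = np.array([0.5, 4.0, 8.0, 13.0, 25.0, 45.0])
band_names = ["delta", "theta", "alpha", "beta", "gamma"]  # conventional names

def bins_per_band(nperseg, fs=fs, edges=band_edges):
    """Count the frequency bins falling in each half-open band [lo, hi)."""
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)  # bin centers, spaced fs / nperseg
    return [int(np.sum((freqs >= lo) & (freqs < hi)))
            for lo, hi in zip(edges[:-1], edges[1:])]

short_resolution = fs / 256    # short epochs
long_resolution = fs / 2000    # long epochs
short_bins = bins_per_band(256)
long_bins = bins_per_band(2000)

print("resolution (Hz):", short_resolution, long_resolution)
for name, s, l in zip(band_names, short_bins, long_bins):
    print(f"{name:>5}: {s:3d} bins (short) vs {l:3d} bins (long)")
```

With nperseg = 256 the first non-zero bin sits at about 1.95 Hz, so the 0.5 Hz lower edge of the Delta band cannot be resolved, whereas with nperseg = 2000 every band edge coincides with a bin.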

To obtain the .npy files used in this notebook you should run the following code block uncommented at the end of the data_processing notebook (after running the imports and the function definition blocks). You can of course also experiment with other choices of parameters.

In [None]:
process_and_save(epoch_length=2000,overlap_ratio=0.5,freq_bands=np.array([0.5,4.0,8.0,13.0,25.0,45.0]),
                 nperseg=256,filenames=['processed_data/short_num_epochs','processed_data/short_rbp'])
process_and_save(epoch_length=15000,overlap_ratio=0.5,freq_bands=np.array([0.5,4.0,8.0,13.0,25.0,45.0]),
                 nperseg=2000,filenames=['processed_data/long_num_epochs','processed_data/long_rbp'])

If the files were written successfully, the following code block should load them without errors.

In [2]:
short_num_epochs = np.load('processed_data/short_num_epochs.npy')
long_num_epochs = np.load('processed_data/long_num_epochs.npy')
short_rbp = np.load('processed_data/short_rbp.npy')
long_rbp = np.load('processed_data/long_rbp.npy')

The shapes should be (88,), (88,), (88,639,5,19) and (88,84,5,19), respectively.

In [None]:
print(short_num_epochs.shape)
print(long_num_epochs.shape)
print(short_rbp.shape)
print(long_rbp.shape)
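
A classifier needs a 2-D feature matrix, so arrays of this shape have to be flattened at some point. The sketch below shows one way that could be done. It rests on assumptions that are not confirmed here: that the second axis is padded up to the largest epoch count, that only the first `num_epochs[i]` epochs of subject `i` are valid, and that the last two axes are the 5 bands and (presumably) 19 channels. `flatten_epochs` is a hypothetical helper and the arrays are toy stand-ins.

```python
import numpy as np

def flatten_epochs(rbp, num_epochs):
    """Stack the valid epochs of every subject into a 2-D feature matrix.

    rbp: (n_subjects, max_epochs, n_bands, n_channels), assumed padded
    num_epochs: (n_subjects,), assumed number of valid epochs per subject
    Returns X of shape (total_epochs, n_bands * n_channels) and the
    subject index of each row, usable as the grouping variable.
    """
    rows, groups = [], []
    for subj, n in enumerate(num_epochs):
        n = int(n)
        rows.append(rbp[subj, :n].reshape(n, -1))  # drop padding, flatten features
        groups.append(np.full(n, subj))
    return np.concatenate(rows), np.concatenate(groups)

# Toy stand-in: 3 subjects, at most 4 epochs, 5 bands, 19 channels.
toy_rbp = np.zeros((3, 4, 5, 19))
toy_num_epochs = np.array([4, 2, 3])
X, groups = flatten_epochs(toy_rbp, toy_num_epochs)
print(X.shape)   # one row per valid epoch, 5 * 19 = 95 features
print(groups)
```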

The following piece of code creates numerical labels for the target variable: 0 for healthy group, 1 for Alzheimer's, and 2 for Frontotemporal dementia. The subject indices for the resulting array are aligned with the subject indices (the first dimension) for each of the arrays we loaded in. 

In [3]:
ppt_diagnostics = pd.read_csv('data/ds004504/participants.tsv',sep='\t')
target_labels = ppt_diagnostics['Group'].apply(create_numeric_labels).values
target_labels

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
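
For readers without access to `processing_functions`, the mapping just described can be sketched as follows. This is a guess at the behavior, not the actual `create_numeric_labels`: the group codes `'A'`, `'C'` and `'F'` are inferred from the `removed_class` arguments used later in this notebook, and `to_numeric_label` and `subjects_kept` are hypothetical names. The group sizes are taken from the array printed above.

```python
import numpy as np

# Assumed group codes: C = healthy control, A = Alzheimer's, F = frontotemporal dementia.
GROUP_TO_LABEL = {"C": 0, "A": 1, "F": 2}

def to_numeric_label(group):
    """Map a group code to its numeric label."""
    return GROUP_TO_LABEL[group]

def subjects_kept(labels, removed_class):
    """Boolean mask over subjects that drops one class for a two-class problem."""
    return labels != GROUP_TO_LABEL[removed_class]

# Same ordering and group sizes as the printed array: 36 A, 29 C, 23 F.
group_codes = ["A"] * 36 + ["C"] * 29 + ["F"] * 23
labels = np.array([to_numeric_label(g) for g in group_codes])

ad_vs_cn = subjects_kept(labels, removed_class="F")   # Alzheimer's vs. healthy
ftd_vs_cn = subjects_kept(labels, removed_class="A")  # FTD vs. healthy
print(np.bincount(labels), ad_vs_cn.sum(), ftd_vs_cn.sum())
```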

Below we run leave-one-subject-out cross-validation for a two-class kNN classifier (Alzheimer's vs. healthy) with k = 3, for both the short and the long epoch versions.

First the short version.

In [None]:
short_ThreeNN_metrics = kNN_cross(short_rbp,short_num_epochs,target_labels,removed_class='F',n_neighbors=3)
short_ThreeNN_metrics

Now the long version.

In [None]:
long_ThreeNN_metrics = kNN_cross(long_rbp,long_num_epochs,target_labels,removed_class='F',n_neighbors=3)
long_ThreeNN_metrics

Note the significant boost in performance with the long epoch version (3-4% across all 4 metrics). Each metric is still about 2% worse than the paper's reported performance of kNN with the short epoch version. I am still not sure why that's the case.

Below we also look at the performance of the classifier for healthy vs. FTD for the short epoch version and long epoch version. 

In [None]:
short_ThreeNN_metrics = kNN_cross(short_rbp,short_num_epochs,target_labels,removed_class='A',n_neighbors=3)
short_ThreeNN_metrics

In [None]:
long_ThreeNN_metrics = kNN_cross(long_rbp,long_num_epochs,target_labels,removed_class='A',n_neighbors=3)
long_ThreeNN_metrics

The long epoch version has quite good accuracy and specificity for FTD vs. healthy (with the short version not being terrible either), though this comes at the cost of worse sensitivity and worse F1 scores. In particular, the combination of low sensitivity and high specificity in both the short and long versions means that the classifier is not great at identifying FTD cases as FTD.

# Random Forest
Next we look at the same performance metrics for the Random Forest classifier. Let's start with the short and long epoch versions of the AD vs. healthy problem.

In [None]:
short_RF_metrics = RF_cross(short_rbp,short_num_epochs,target_labels,removed_class='F')
short_RF_metrics

In [None]:
long_RF_metrics = RF_cross(long_rbp,long_num_epochs,target_labels,removed_class='F',
                           min_samples_split = 0.005, PCA_components = 60)
long_RF_metrics

In [None]:
long_RF_metrics = RF_cross(long_rbp,long_num_epochs,target_labels,removed_class='C')
long_RF_metrics

# Multilayer Perceptron
Next we look at the same performance metrics for the MLP classifier.

In [None]:
short_MLP_metrics = MLP_cross(short_rbp,short_num_epochs,target_labels,removed_class='F')
short_MLP_metrics

In [None]:
long_MLP_metrics = MLP_cross(long_rbp,long_num_epochs,target_labels,removed_class='F')
long_MLP_metrics