# An Experimental Study on Speech Based Parkinson's Disease Detection - Baseline ML Notebook.



In this Notebook we explore the preditive capabilities of baseline ML models on the task of speech feature based Parkinson's disease detection. Here no preprocessing, data science approach is applied, only off the shelf ML models are considered. The experimental results from this notebook can be used to establish a baseline for future advanced approaches.

For the purpose experimentation a publicly available dataset is utilized, 



*   Sakar, C., Serbes, Gorkem, Gunduz, Aysegul, Nizam, Hatice & Sakar, Betul. (2018). Parkinson's Disease Classification. UCI Machine Learning Repository. (https://archive-beta.ics.uci.edu/ml/datasets/parkinson+s+disease+classification)






The dataset is present as a csv file, which is worked on next.

In [1]:
# Connect to Google Drive to access dataset
from google.colab import drive

drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [10]:
# Check dataset filenames
import os
import pandas as pd
import numpy as np

pd_df = pd.read_csv('/content/gdrive/My Drive/pd_speech_features.csv', header=None)
pd_df = pd_df.drop(pd_df.index[0])
new_header = pd_df.iloc[0] #grab the first row for the header
pd_df = pd_df[1:] #take the data less the header row
pd_df.columns = new_header #set the header row as the df header
pd_df.head()

1,id,gender,PPE,DFA,RPDE,numPulses,numPeriodsPulses,meanPeriodPulses,stdDevPeriodPulses,locPctJitter,...,tqwt_kurtosisValue_dec_28,tqwt_kurtosisValue_dec_29,tqwt_kurtosisValue_dec_30,tqwt_kurtosisValue_dec_31,tqwt_kurtosisValue_dec_32,tqwt_kurtosisValue_dec_33,tqwt_kurtosisValue_dec_34,tqwt_kurtosisValue_dec_35,tqwt_kurtosisValue_dec_36,class
2,0,1,0.85247,0.71826,0.57227,240,239,0.00806353,8.68e-05,0.00218,...,1.562,2.6445,3.8686,4.2105,5.1221,4.4625,2.6202,3.0004,18.9405,1
3,0,1,0.76686,0.69481,0.53966,234,233,0.008258256,7.31e-05,0.00195,...,1.5589,3.6107,23.5155,14.1962,11.0261,9.5082,6.5245,6.3431,45.178,1
4,0,1,0.85083,0.67604,0.58982,232,231,0.00833959,6.04e-05,0.00176,...,1.5643,2.3308,9.4959,10.7458,11.0177,4.8066,2.9199,3.1495,4.7666,1
5,1,0,0.41121,0.79672,0.59257,178,177,0.010857733,0.000182739,0.00419,...,3.7805,3.5664,5.2558,14.0403,4.2235,4.6857,4.846,6.265,4.0603,1
6,1,0,0.3279,0.79782,0.53028,236,235,0.008161574,0.002668863,0.00535,...,6.1727,5.8416,6.0805,5.7621,7.7817,11.6891,8.2103,5.0559,6.1164,1


In [11]:
# drop the id column
pd_df = pd_df.drop(['id'],axis = 1)

# separate out class labels and then drop from dataframe
pd_labels_df = pd_df[['class']]
pd_features_df = pd_df.drop(['class'],axis=1)

print(pd_features_df.head())
print(pd_labels_df.head())

1 gender      PPE      DFA     RPDE numPulses numPeriodsPulses  \
2      1  0.85247  0.71826  0.57227       240              239   
3      1  0.76686  0.69481  0.53966       234              233   
4      1  0.85083  0.67604  0.58982       232              231   
5      0  0.41121  0.79672  0.59257       178              177   
6      0   0.3279  0.79782  0.53028       236              235   

1 meanPeriodPulses stdDevPeriodPulses locPctJitter locAbsJitter  ...  \
2       0.00806353           8.68E-05      0.00218     1.76E-05  ...   
3      0.008258256           7.31E-05      0.00195     1.61E-05  ...   
4       0.00833959           6.04E-05      0.00176     1.47E-05  ...   
5      0.010857733        0.000182739      0.00419     4.55E-05  ...   
6      0.008161574        0.002668863      0.00535     4.37E-05  ...   

1 tqwt_kurtosisValue_dec_27 tqwt_kurtosisValue_dec_28  \
2                    1.5466                     1.562   
3                     1.553                    1.5589   

In [21]:
allData = pd_features_df.values.tolist()
allLabels = pd_labels_df.values.tolist()

allData = np.array(allData).astype('float64')
allLabels = np.array(allLabels).astype('int')

allData = np.array(allData)
allLabels = np.array(allLabels).ravel()

print(allData.shape)
print(allLabels.shape)

(756, 753)
(756,)


In [27]:
# Random Forest Model training on training data and Evaluation on test data
from sklearn.ensemble import RandomForestClassifier as RFC
from sklearn.metrics import confusion_matrix as cm
from sklearn import metrics
from sklearn.model_selection import train_test_split

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  clf = RFC()
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('RFC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..
RFC classification results: 

Average Accuracy:  0.8705263157894737
Average Sensitivity:  0.8695012479151493
Average Specificity:  0.8826512646634836
Average AUC:  0.7740549126389834


In [29]:
# AdaBoost Model Training/Testing

from sklearn.ensemble import AdaBoostClassifier as ABC

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  clf = ABC()
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('\n ABC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..

 ABC classification results: 

Average Accuracy:  0.8696052631578947
Average Sensitivity:  0.8948193002718604
Average Specificity:  0.787400440228278
Average AUC:  0.8073542092126162


In [30]:
# Gradient Boosting Model Training/Testing

from sklearn.ensemble import GradientBoostingClassifier as GBC

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  clf = GBC()
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('\n ABC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..

 ABC classification results: 

Average Accuracy:  0.8844736842105263
Average Sensitivity:  0.8843616793711168
Average Specificity:  0.8903755180869922
Average AUC:  0.8010664851372816


In [34]:
# Support Vector Machine Model With RBF Kernel Training/Testing

from sklearn.svm import LinearSVC as SVC


accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  clf = SVC(dual=False)
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('\n LinearSVC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..

 LinearSVC classification results: 

Average Accuracy:  0.7607894736842107
Average Sensitivity:  0.7714190663514909
Average Specificity:  0.6278502001954014
Average AUC:  0.5675970047651464


In [36]:
# Support Vector Machine Model With RBF Kernel Training/Testing

from sklearn.linear_model import LogisticRegression as LR


accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  clf = LR(dual=False, max_iter=100000)
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('\n LR classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..

 LR classification results: 

Average Accuracy:  0.7618421052631577
Average Sensitivity:  0.7759643746809337
Average Specificity:  0.626425438727229
Average AUC:  0.5777081915135013


In [37]:
# Results for Ensemble of heterogeneous classifiers

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  clf1 = RFC()
  clf2 = ABC()
  clf3 = GBC()
  clf4 = SVC(dual=False)
  clf5 = LR(dual=False, max_iter=100000)
  clf1.fit(xTrain,yTrain)
  clf2.fit(xTrain,yTrain)
  clf3.fit(xTrain,yTrain)
  clf4.fit(xTrain,yTrain)
  clf5.fit(xTrain,yTrain)

  binPreds1 = clf1.predict(xTest)
  binPreds2 = clf2.predict(xTest)
  binPreds3 = clf3.predict(xTest)
  binPreds4 = clf4.predict(xTest)
  binPreds5 = clf5.predict(xTest)

  binPreds = list()
  for i in range(len(binPreds1)):
    if binPreds1[i] + binPreds2[i] + binPreds3[i] + binPreds4[i] + binPreds5[i] >= 3:
      binPreds.append(1)
    else:
      binPreds.append(0)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('\n ABC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..

 ABC classification results: 

Average Accuracy:  0.8625000000000002
Average Sensitivity:  0.8547130164234916
Average Specificity:  0.9136508264395629
Average AUC:  0.7485069208078058


Random Forest Classification Results

1. Average Accuracy:  0.8705263157894737
2. Average Sensitivity:  0.8695012479151493
3. Average Specificity:  0.8826512646634836
4. Average AUC:  0.7740549126389834

AdaBoost Classification Results

1. Average Accuracy:  0.8696052631578947
2. Average Sensitivity:  0.8948193002718604
3. Average Specificity:  0.787400440228278
4. Average AUC:  0.8073542092126162

Gradient Boosting Classifier Results

1. Average Accuracy:  0.8844736842105263
2. Average Sensitivity:  0.8843616793711168
3. Average Specificity:  0.8903755180869922
4. Average AUC:  0.8010664851372816

Support Vector Machine with Linear Kernel Results

1. Average Accuracy:  0.7607894736842107
2. Average Sensitivity:  0.7714190663514909
3. Average Specificity:  0.6278502001954014
4. Average AUC:  0.5675970047651464

Logistic Regression Results

1. Average Accuracy:  0.7618421052631577
2. Average Sensitivity:  0.7759643746809337
3. Average Specificity:  0.626425438727229
4. Average AUC:  0.5777081915135013

Heterogeneous Classifier Ensembling Results

1. Average Accuracy:  0.8625000000000002
2. Average Sensitivity:  0.8547130164234916
3. Average Specificity:  0.9136508264395629
4. Average AUC:  0.7485069208078058



We observe that overall the Gradient Boosting Classifier is achieving the best results. 

Gradient Boosting Classifier Results (Best Overall)

1. Average Accuracy: 0.8844736842105263
2. Average Sensitivity: 0.8843616793711168
3. Average Specificity: 0.8903755180869922
4. Average AUC: 0.8010664851372816

Heterogenerous classifier ensembling didn't improve the predictions all round, however it did improve the specificity measure by 2-3%