# An Experimental Study on Speech Based Parkinson's Disease Detection With Feature Scaling and Selection - ML Notebook.



In this Notebook we explore the preditive capabilities of baseline ML models on the task of speech feature based Parkinson's disease detection, when it is coupled with Feature Scaling and Selection. The experimental results from this notebook can be used to establish an understanding of how Feature Scaling and Selection can impact the overall classification performance. Since according to the baseline classification results Gradient Boosting Classifier achieved the best classification results.

For the purpose experimentation a publicly available dataset is utilized, 



*   Sakar, C., Serbes, Gorkem, Gunduz, Aysegul, Nizam, Hatice & Sakar, Betul. (2018). Parkinson's Disease Classification. UCI Machine Learning Repository. (https://archive-beta.ics.uci.edu/ml/datasets/parkinson+s+disease+classification)






The dataset is present as a csv file, which is worked on next.

In [2]:
# Connect to Google Drive to access dataset
from google.colab import drive

drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [3]:
# Check dataset filenames
import os
import pandas as pd
import numpy as np

pd_df = pd.read_csv('/content/gdrive/My Drive/pd_speech_features.csv', header=None)
pd_df = pd_df.drop(pd_df.index[0])
new_header = pd_df.iloc[0] #grab the first row for the header
pd_df = pd_df[1:] #take the data less the header row
pd_df.columns = new_header #set the header row as the df header
pd_df.head()

1,id,gender,PPE,DFA,RPDE,numPulses,numPeriodsPulses,meanPeriodPulses,stdDevPeriodPulses,locPctJitter,...,tqwt_kurtosisValue_dec_28,tqwt_kurtosisValue_dec_29,tqwt_kurtosisValue_dec_30,tqwt_kurtosisValue_dec_31,tqwt_kurtosisValue_dec_32,tqwt_kurtosisValue_dec_33,tqwt_kurtosisValue_dec_34,tqwt_kurtosisValue_dec_35,tqwt_kurtosisValue_dec_36,class
2,0,1,0.85247,0.71826,0.57227,240,239,0.00806353,8.68e-05,0.00218,...,1.562,2.6445,3.8686,4.2105,5.1221,4.4625,2.6202,3.0004,18.9405,1
3,0,1,0.76686,0.69481,0.53966,234,233,0.008258256,7.31e-05,0.00195,...,1.5589,3.6107,23.5155,14.1962,11.0261,9.5082,6.5245,6.3431,45.178,1
4,0,1,0.85083,0.67604,0.58982,232,231,0.00833959,6.04e-05,0.00176,...,1.5643,2.3308,9.4959,10.7458,11.0177,4.8066,2.9199,3.1495,4.7666,1
5,1,0,0.41121,0.79672,0.59257,178,177,0.010857733,0.000182739,0.00419,...,3.7805,3.5664,5.2558,14.0403,4.2235,4.6857,4.846,6.265,4.0603,1
6,1,0,0.3279,0.79782,0.53028,236,235,0.008161574,0.002668863,0.00535,...,6.1727,5.8416,6.0805,5.7621,7.7817,11.6891,8.2103,5.0559,6.1164,1




In [4]:
# drop the id column
pd_df = pd_df.drop(['id'],axis = 1)

# separate out class labels and then drop from dataframe
pd_labels_df = pd_df[['class']]
pd_features_df = pd_df.drop(['class'],axis=1)

print(pd_features_df.head())
print(pd_labels_df.head())

1 gender      PPE      DFA     RPDE numPulses numPeriodsPulses  \
2      1  0.85247  0.71826  0.57227       240              239   
3      1  0.76686  0.69481  0.53966       234              233   
4      1  0.85083  0.67604  0.58982       232              231   
5      0  0.41121  0.79672  0.59257       178              177   
6      0   0.3279  0.79782  0.53028       236              235   

1 meanPeriodPulses stdDevPeriodPulses locPctJitter locAbsJitter  ...  \
2       0.00806353           8.68E-05      0.00218     1.76E-05  ...   
3      0.008258256           7.31E-05      0.00195     1.61E-05  ...   
4       0.00833959           6.04E-05      0.00176     1.47E-05  ...   
5      0.010857733        0.000182739      0.00419     4.55E-05  ...   
6      0.008161574        0.002668863      0.00535     4.37E-05  ...   

1 tqwt_kurtosisValue_dec_27 tqwt_kurtosisValue_dec_28  \
2                    1.5466                     1.562   
3                     1.553                    1.5589   

In [5]:
allData = pd_features_df.values.tolist()
allLabels = pd_labels_df.values.tolist()

allData = np.array(allData).astype('float64')
allLabels = np.array(allLabels).astype('int')

allData = np.array(allData)
allLabels = np.array(allLabels).ravel()

print(allData.shape)
print(allLabels.shape)

(756, 753)
(756,)


### Establish a Baseline with GBC Model

In [6]:
# First we establish a baseline with GBC, with no preprocessing.
from sklearn.ensemble import GradientBoostingClassifier as GBC
from sklearn.metrics import confusion_matrix as cm
from sklearn import metrics
from sklearn.model_selection import train_test_split

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  clf = GBC()
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('GBC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..
GBC classification results: 

Average Accuracy:  0.8855263157894736
Average Sensitivity:  0.8858855026650322
Average Specificity:  0.8874589753822438
Average AUC:  0.8036215112321307


### Evaluate the Utility of MinMaxScaler

In [7]:
# Next we evaluate the utility of MinMaxScaler based Feature Scaling.
from sklearn.preprocessing import MinMaxScaler

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  # Fit a scaler according to xTrain, then transform both xTrain and xTest
  scaler = MinMaxScaler()
  scaler.fit(xTrain)
  xTrain = scaler.transform(xTrain)
  xTest = scaler.transform(xTest)

  clf = GBC()
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('GBC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..
GBC classification results: 

Average Accuracy:  0.8817105263157895
Average Sensitivity:  0.8833005396240781
Average Specificity:  0.8788876466923539
Average AUC:  0.7983685046516906


### Evaluate the Utility of StandardScaler

In [8]:
# Next we evaluate the utility of MinMaxScaler based Feature Scaling.
from sklearn.preprocessing import StandardScaler

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  # Fit a scaler according to xTrain, then transform both xTrain and xTest
  scaler = StandardScaler()
  scaler.fit(xTrain)
  xTrain = scaler.transform(xTrain)
  xTest = scaler.transform(xTest)

  clf = GBC()
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('GBC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..
GBC classification results: 

Average Accuracy:  0.8810526315789474
Average Sensitivity:  0.8825574769857225
Average Specificity:  0.8798134931004691
Average AUC:  0.7970864533696391


### Evaluate RFECV Based Feature Selection

In [9]:
from sklearn.feature_selection import RFECV
from sklearn.ensemble import RandomForestClassifier as RFC
estimatorModel = RFC()

# Next we evaluate the utility of MinMaxScaler based Feature Scaling.
from sklearn.preprocessing import StandardScaler

accList = list()
sensList = list()
specList = list()
aucList = list()

for i in range(50):
  if i % 10 == 0:
    print('Experiment on Train-Test split number '+str(i+1)+' started..')

  xTrain, xTest, yTrain, yTest = train_test_split(allData, allLabels, test_size=0.2, stratify=allLabels)

  # Feature Selection by RFECV
  selector = RFECV(estimatorModel, step=20, cv=3)
  selector = selector.fit(xTrain, yTrain)
  xTrain = selector.transform(xTrain)
  xTest = selector.transform(xTest)

  clf = GBC()
  clf.fit(xTrain,yTrain)

  binPreds = clf.predict(xTest)

  TN, FP, FN, TP = cm(binPreds, yTest).ravel()

  sens = TP / (TP + FN)
  spec = TN / (TN + FP)
  acc = (TP + TN) / (TP + FP + TN + FN)


  accList.append(acc)
  sensList.append(sens)
  specList.append(spec)

  fpr, tpr, thresholds = metrics.roc_curve(yTest, binPreds, pos_label=1)
  aucList.append(metrics.auc(fpr, tpr))

print('GBC classification results: \n')
print('Average Accuracy: ',np.mean(accList))
print('Average Sensitivity: ',np.mean(sensList))
print('Average Specificity: ',np.mean(specList))
print('Average AUC: ',np.mean(aucList))

Experiment on Train-Test split number 1 started..
Experiment on Train-Test split number 11 started..
Experiment on Train-Test split number 21 started..
Experiment on Train-Test split number 31 started..
Experiment on Train-Test split number 41 started..
GBC classification results: 

Average Accuracy:  0.8828947368421054
Average Sensitivity:  0.8881704989373824
Average Specificity:  0.8650476031007328
Average AUC:  0.8053778080326753


### Overall findings

GBC Baseline Results

1. Average Accuracy:  0.8855263157894736
2. Average Sensitivity:  0.8858855026650322
3. Average Specificity:  0.8874589753822438
4. Average AUC:  0.8036215112321307

GBC-MinMaxScaler Results

1. Average Accuracy:  0.8817105263157895
2. Average Sensitivity:  0.8833005396240781
3. Average Specificity:  0.8788876466923539
4. Average AUC:  0.7983685046516906

GBC-StandardScaler Results

1. Average Accuracy:  0.8810526315789474
2. Average Sensitivity:  0.8825574769857225
3. Average Specificity:  0.8798134931004691
4. Average AUC:  0.7970864533696391

GBC-RFECV Results

1. Average Accuracy:  0.8828947368421054
2. Average Sensitivity:  0.8881704989373824
3. Average Specificity:  0.8650476031007328
4. Average AUC:  0.8053778080326753

From these experimental results it is clear that feature scaling is definitely not helping.

Feature selection does improve the Avg. AUC and Sensitivity very slightly, however it decreases the Avg. Accuracy and Specificity in a similar way.

So far none of these approaches seems to be effective at improving the overall classification performances.