# **Parkinson's Disease (PD) Wrist-Mounted Prediction Model**
<div class="alert-info">
    <p>Redback Operations: <strong>Lachesis</strong></p>
</div>

## **Model Objective**

The objective of this project is to build a model that uses data from a wrist-worn device to determine whether the wearer is experiencing hand tremors, a potential indicator of Parkinson's Disease.

## **Model Implementation**

### **Initial Setup**

In [None]:
# Import supporting libraries

#import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

from sklearn.metrics import classification_report, confusion_matrix

In [None]:
# Load the dataset

# Load raw HandTremorDataset.csv from desired folder
# NOTE: This dataset has 9 features; the last three (magnetometer readings) are not applicable to our model and are dropped below
dataset = pd.read_csv("/content/datasets/HandTremorDataset.csv")

# Drop the readings from the 3-Axis Magnetometer
dataset = dataset.drop(columns=['mX', 'mY', 'mZ'])

print("HandTremorDataset.csv sample:\n")
print(dataset.head())

print("\naX, aY, aZ: Readings from the 3-Axis Accelerometer")
print("gX, gY, gZ: Readings from the 3-Axis Gyro")

HandTremorDataset.csv sample:

     aX     aY    aZ    gX    gY    gZ  Result
0 -2544  14340 -6864  3840  -335 -1518       1
1 -2380  14188 -5644  3888  1635 -9604       1
2  -524  14480 -7148  3888 -1663 -1491       1
3  -528  15052 -6672  3904   461 -4012       1
4 -2808  14040 -5936  3904   595  -339       1

aX, aY, aZ: Readings from the 3-Axis Accelerometer
gX, gY, gZ: Readings from the 3-Axis Gyro


In [None]:
# Analyse the dataset

# Validate integrity of the dataset
print("Dataset analysis:\n")
print(dataset.info())
print(dataset.describe())

# Check for missing values
print("\nMissing values:\n")
missing_count = dataset.isnull().sum().sort_values()
# isnull().mean() is the fraction of missing entries per column
missing_percentage = (dataset.isnull().mean() * 100).sort_values()
print(pd.concat([missing_count, missing_percentage], axis=1, keys=['Number of values', '% Grand Total']))

Dataset analysis:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 27995 entries, 0 to 27994
Data columns (total 7 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   aX      27995 non-null  int64
 1   aY      27995 non-null  int64
 2   aZ      27995 non-null  int64
 3   gX      27995 non-null  int64
 4   gY      27995 non-null  int64
 5   gZ      27995 non-null  int64
 6   Result  27995 non-null  int64
dtypes: int64(7)
memory usage: 1.5 MB
None
                 aX            aY            aZ            gX            gY  \
count  27995.000000  27995.000000  27995.000000  27995.000000  27995.000000   
mean      54.174960   5756.362779 -13338.659904   5002.228684   -239.641400   
std     5220.775301   5201.669402   3059.030305    485.190460    999.942236   
min    -9488.000000  -3148.000000 -19600.000000   3840.000000 -12023.000000   
25%    -5332.000000    776.000000 -16260.000000   4608.000000   -752.000000   
50%     1004.000000   4144.000000 -14248.
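
Alongside the integrity and missing-value checks, it can help to look at how the two `Result` classes are balanced before splitting the data. The following is a minimal sketch, assuming a DataFrame with a binary `Result` column. The `class_balance` helper is hypothetical and the small `demo` frame is made-up illustrative data, not the real dataset.

```python
import pandas as pd


def class_balance(frame: pd.DataFrame, label_col: str = "Result") -> pd.DataFrame:
    """Return per-class row counts and percentage share (hypothetical helper)."""
    counts = frame[label_col].value_counts().sort_index()
    share = (counts / len(frame) * 100).round(2)
    return pd.DataFrame({"count": counts, "percent": share})


# Made-up stand-in for the real dataset, for illustration only
demo = pd.DataFrame({"aX": [1, 2, 3, 4, 5], "Result": [0, 1, 1, 0, 1]})

balance = class_balance(demo)
print(balance)
```

In the notebook itself, the same helper could be called as `class_balance(dataset)`.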

In [None]:
# Load data into variables
def load_data():
    data = dataset
    # X contains 3D gyro and 3D accelerometer data
    X = data[['aX', 'aY', 'aZ', 'gX', 'gY', 'gZ']]
    # y contains labels (0 for no tremor, 1 for tremor)
    y = data['Result']
    return X, y

X, y = load_data()

In [None]:
# Split the data into training and test sets

# Utilise a 70/30 split for Train/Test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 42)

print("\nTrain/Test set sizes:\n")
print("X_train shape:", X_train.shape)
print("X_test shape:", X_test.shape)
print("y_train shape:", y_train.shape)
print("y_test shape:", y_test.shape)


Train/Test set sizes:

X_train shape: (19596, 6)
X_test shape: (8399, 6)
y_train shape: (19596,)
y_test shape: (8399,)
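
The split above is a plain random 70/30 split. One possible variation, not used in this notebook, is to pass `stratify` to `train_test_split` so that both partitions keep the same class ratio. The sketch below runs on synthetic stand-in data (the `X_demo` and `y_demo` names and their values are assumptions for illustration, not the real readings).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real dataset): 100 rows, 60% tremor labels
rng = np.random.default_rng(42)
X_demo = pd.DataFrame(
    rng.integers(-1000, 1000, size=(100, 6)),
    columns=["aX", "aY", "aZ", "gX", "gY", "gZ"],
)
y_demo = pd.Series([1] * 60 + [0] * 40, name="Result")

# stratify=y_demo preserves the 60/40 class ratio in both partitions
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

print("Train size:", len(y_tr), "tremor share:", y_tr.mean())
print("Test size:", len(y_te), "tremor share:", y_te.mean())
```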


### **Random Forest Classifier**

In [None]:
# Build the Random Forest Classifier model

# Initialize the Random Forest Classifier
rf_model = RandomForestClassifier(random_state = 42)

# Train the model
rf_model.fit(X_train, y_train)

In [None]:
# Evaluate the model's performance

# Make predictions
y_pred_rf = rf_model.predict(X_test)

# Print results
print("Classification Report:")
print(classification_report(y_test, y_pred_rf))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred_rf))

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00      3868
           1       1.00      1.00      1.00      4531

    accuracy                           1.00      8399
   macro avg       1.00      1.00      1.00      8399
weighted avg       1.00      1.00      1.00      8399

Confusion Matrix:
[[3856   12]
 [  12 4519]]
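
The classification report rounds every score to 1.00, although the confusion matrix shows 24 misclassified rows out of 8,399. As a small sketch, the unrounded figures can be recovered from the matrix itself. The array below copies the printed values, and the variable names are chosen here for illustration.

```python
import numpy as np

# Random Forest confusion matrix as printed above
# (scikit-learn convention: rows are true labels, columns are predictions)
cm_rf = np.array([[3856, 12],
                  [12, 4519]])

# For a binary problem, ravel() yields TN, FP, FN, TP in that order
tn, fp, fn, tp = cm_rf.ravel()

accuracy = (tp + tn) / cm_rf.sum()
precision_tremor = tp / (tp + fp)  # of predicted tremors, how many were real
recall_tremor = tp / (tp + fn)     # of real tremors, how many were detected

print(f"Accuracy:           {accuracy:.4f}")
print(f"Precision (tremor): {precision_tremor:.4f}")
print(f"Recall (tremor):    {recall_tremor:.4f}")
```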


### **Logistic Regression**

In [None]:
# Build the Logistic Regression model

# Initialize the Logistic Regression model
lr_model = LogisticRegression(random_state = 42, max_iter = 1000)

# Train the model
lr_model.fit(X_train, y_train)

In [None]:
# Evaluate the model's performance

# Make predictions
y_pred_lr = lr_model.predict(X_test)

# Print results
print("Classification Report:")
print(classification_report(y_test, y_pred_lr))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred_lr))

Classification Report:
              precision    recall  f1-score   support

           0       0.75      0.76      0.76      3868
           1       0.79      0.79      0.79      4531

    accuracy                           0.77      8399
   macro avg       0.77      0.77      0.77      8399
weighted avg       0.77      0.77      0.77      8399

Confusion Matrix:
[[2939  929]
 [ 967 3564]]
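
The model above is trained on raw sensor counts, whose columns differ widely in scale. A common variation is to standardise the features before fitting logistic regression. The sketch below shows the mechanics with a scikit-learn pipeline on synthetic stand-in data from `make_classification` (an assumption for illustration, not the tremor dataset). Whether scaling would change the 0.77 accuracy reported above is not tested here, since the decision boundary stays linear either way.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the tremor dataset), six features like aX..gZ
X_demo, y_demo = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

# The scaler is fitted on the training data only, then applied to the test data
scaled_lr = make_pipeline(
    StandardScaler(),
    LogisticRegression(random_state=42, max_iter=1000),
)
scaled_lr.fit(X_tr, y_tr)

score = scaled_lr.score(X_te, y_te)
print(f"Accuracy on synthetic test split: {score:.3f}")
```

Wrapping the scaler and the classifier in one pipeline keeps the test data out of the scaling statistics.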


### **Decision Tree Classifier**

In [None]:
# Build the Decision Tree Classifier model

# Initialize the Decision Tree Classifier
dt_model = DecisionTreeClassifier(random_state = 42)

# Train the model
dt_model.fit(X_train, y_train)

In [None]:
# Evaluate the model's performance

# Make predictions
y_pred_dt = dt_model.predict(X_test)

# Print results
print("Classification Report:")
print(classification_report(y_test, y_pred_dt))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred_dt))

Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.99      1.00      3868
           1       1.00      1.00      1.00      4531

    accuracy                           1.00      8399
   macro avg       1.00      1.00      1.00      8399
weighted avg       1.00      1.00      1.00      8399

Confusion Matrix:
[[3848   20]
 [  16 4515]]
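
The three models follow the same build, train and evaluate steps, so the cells can also be expressed as one loop that collects a score per model for side-by-side comparison. This is a sketch of that pattern on synthetic stand-in data (an assumption for illustration). The `models` and `results` dictionaries are names introduced here, and the printed scores say nothing about the tremor dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the tremor dataset)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

# Same estimators and settings as the cells above
models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(random_state=42, max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}

# Fit each model and record its test accuracy
results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    results[name] = accuracy_score(y_te, model.predict(X_te))

# Print the models from highest to lowest accuracy
for name, acc in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} {acc:.3f}")
```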
