# Detecting Alzheimer's from Features of Handwriting

### We aim to train a logistic regression model, implemented as a single-layer neural network, to classify handwriting samples as belonging to a patient with Alzheimer's or not. The training and testing data come from the DARWIN dataset created by Francesco Fontanella.

### Source: https://archive.ics.uci.edu/dataset/732/darwin
### Kaggle Link: https://www.kaggle.com/datasets/taeefnajib/handwriting-data-to-detect-alzheimers-disease/data


### We do the above in 3 Main Steps:

### Step 1) - Design the Model ( Input Size, Output Size, Activation Functions, Hidden Layers ) -- Forward Propagation
### Step 2) - Create Optimizer, choose Loss function ( Binary Cross Entropy Loss since this is a classification problem )
### Step 3) - Training Loop -- Calculate Loss, Backward Propagation (Calculate Gradients), Update Weights & Biases (Stochastic Gradient Descent)

In [10]:
import torch
import torch.nn as nn
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score, precision_score, f1_score

In [11]:
### Step 0 -- Readying the Data ###
pd.options.mode.copy_on_write = True # Enabling Copy on Write so mutations to copies of df do not mutate df.
df = pd.read_csv('./data.csv').drop(labels=['ID'], axis=1)

X_data = df.drop(labels=['class'], axis=1)
# Encode the labels: 'P' (patient) -> 1, 'H' (healthy) -> 0
Y_data = df['class'].map({'P': 1, 'H': 0})


X_data = X_data.to_numpy().astype(np.float32)
Y_data = Y_data.to_numpy().astype(np.float32)

n_samples, n_features = X_data.shape # 174 x 450 (174 samples, 450 features)

X_train, X_test, Y_train, Y_test = train_test_split(X_data, Y_data, test_size=0.2, random_state=1234)

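sc = StandardScaler()
X_train = sc.fit_transform(X_train) # learn mean / std from the training split only
X_test = sc.transform(X_test) # reuse the training statistics so no test information leaks into scaling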

X_train = torch.from_numpy(X_train)
X_test = torch.from_numpy(X_test)

Y_train = torch.from_numpy(Y_train)
Y_test = torch.from_numpy(Y_test)

Y_train = Y_train.view(Y_train.shape[0], 1)
Y_test = Y_test.view(Y_test.shape[0], 1)
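
One detail worth care in the scaling step: the scaler's mean and standard deviation should be learned from the training split and then reused on the test split. The following is a minimal sketch on synthetic data; the array shapes, the random seed, and the `_demo` names are illustrative assumptions, not the DARWIN features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in data (assumption): 3 features with mean 5 and std 2
X_train_demo = rng.normal(loc=5.0, scale=2.0, size=(100, 3)).astype(np.float32)
X_test_demo = rng.normal(loc=5.0, scale=2.0, size=(20, 3)).astype(np.float32)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_demo)  # learns mean / std from training data
X_test_scaled = scaler.transform(X_test_demo)        # reuses the training mean / std

# The training split is standardized exactly; the test split only approximately.
print(X_train_scaled.mean(axis=0).round(3))
print(X_test_scaled.mean(axis=0).round(3))
```

The test split is not expected to come out with exactly zero mean, since it is scaled with statistics it did not contribute to.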

### Designing the Model

In [12]:
class LogisticRegression(nn.Module):

    def __init__(self, n_input_features):
        super(LogisticRegression, self).__init__()

        self.linear = nn.Linear(n_input_features, 1)
    
    def forward(self, X):
        # f = sigmoid(w*x + b)
        return torch.sigmoid(self.linear(X))

model = LogisticRegression(n_features)
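
The model above applies the sigmoid inside `forward`, which pairs with Binary Cross Entropy Loss on probabilities. An equivalent formulation, sketched here as an alternative rather than as this notebook's method, returns raw logits and uses `nn.BCEWithLogitsLoss`, which the PyTorch documentation describes as more numerically stable. The class name `LogisticRegressionLogits` and the random inputs are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class LogisticRegressionLogits(nn.Module):  # hypothetical variant of the model
    def __init__(self, n_input_features):
        super().__init__()
        self.linear = nn.Linear(n_input_features, 1)

    def forward(self, X):
        return self.linear(X)  # raw logits w*x + b, no sigmoid here

demo_model = LogisticRegressionLogits(4)
X_demo = torch.randn(8, 4)                      # random stand-in features
y_demo = torch.randint(0, 2, (8, 1)).float()    # random stand-in labels

logits = demo_model(X_demo)
loss_from_logits = nn.BCEWithLogitsLoss()(logits, y_demo)
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), y_demo)

# Both routes compute the same loss value, up to floating-point error
print(loss_from_logits.item(), loss_from_probs.item())
```

With this variant, predictions at evaluation time would need an explicit `torch.sigmoid` before thresholding.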

### Optimizer ( Applies the weight updates from the gradients ) + Loss Function

In [13]:
# Step 2) - Getting Loss function and Optimizer
# Using Binary Cross Entropy Loss & Stochastic Gradient Descent

loss_fn = nn.BCELoss()

learning_rate = 0.01

optimizer = torch.optim.SGD(model.parameters(), lr = learning_rate)
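
To make the loss concrete, the sketch below computes binary cross entropy by hand, as the mean of -[y * log(p) + (1 - y) * log(1 - p)], and compares it with `nn.BCELoss`. The three probabilities and targets are made-up values for illustration only.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and true labels (assumption)
probs = torch.tensor([[0.9], [0.2], [0.6]])
targets = torch.tensor([[1.0], [0.0], [0.0]])

# Mean over samples of -[y*log(p) + (1-y)*log(1-p)]
manual_bce = -(targets * torch.log(probs) + (1 - targets) * torch.log(1 - probs)).mean()
library_bce = nn.BCELoss()(probs, targets)

print(manual_bce.item(), library_bce.item())
```

The third sample dominates the loss because the model assigns probability 0.6 to the wrong class.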

### Training Loop

In [14]:
# Step 3) - Training Loop

number_iterations = 200

for epoch in range(number_iterations):
    # Forward pass / propagation --- Calculate y_predicted
    y_predicted = model(X_train) # calling the module runs forward() along with any registered hooks

    # Calculate loss using current y_predicted
    loss = loss_fn(y_predicted, Y_train)

    # Calculate gradient for weights and biases - Backwards Propagation
    loss.backward()

    # Update weights and biases
    optimizer.step()
    optimizer.zero_grad()

    if (epoch % 10 == 0):
        print(f"epoch = {epoch}, loss = {loss.item()}")

epoch = 0, loss = 0.8327412605285645
epoch = 10, loss = 0.4486154317855835
epoch = 20, loss = 0.35812291502952576
epoch = 30, loss = 0.30859440565109253
epoch = 40, loss = 0.27473074197769165
epoch = 50, loss = 0.2493411898612976
epoch = 60, loss = 0.2292877584695816
epoch = 70, loss = 0.21288810670375824
epoch = 80, loss = 0.1991301327943802
epoch = 90, loss = 0.1873588263988495
epoch = 100, loss = 0.1771279275417328
epoch = 110, loss = 0.16812138259410858
epoch = 120, loss = 0.16010817885398865
epoch = 130, loss = 0.1529151052236557
epoch = 140, loss = 0.1464092880487442
epoch = 150, loss = 0.14048689603805542
epoch = 160, loss = 0.13506537675857544
epoch = 170, loss = 0.1300780177116394
epoch = 180, loss = 0.12547031044960022
epoch = 190, loss = 0.12119711935520172
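
Because each iteration passes the whole training set through the model, the loop above performs full-batch gradient descent even though the optimizer is `torch.optim.SGD`. A hedged sketch of a mini-batch variant follows. It runs on synthetic data, and the batch size, learning rate, epoch count, and all `demo_` names are illustrative assumptions rather than settings used in this notebook.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic, linearly separable stand-in data (not the DARWIN features)
X_demo = torch.randn(64, 5)
y_demo = (X_demo[:, :1] > 0).float()

demo_model = nn.Sequential(nn.Linear(5, 1), nn.Sigmoid())
demo_loss_fn = nn.BCELoss()
demo_optimizer = torch.optim.SGD(demo_model.parameters(), lr=0.1)
demo_loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=16, shuffle=True)

def full_loss():
    # Loss over the whole synthetic set, without tracking gradients
    with torch.no_grad():
        return demo_loss_fn(demo_model(X_demo), y_demo).item()

initial_loss = full_loss()
for epoch in range(50):
    for X_batch, y_batch in demo_loader:
        demo_optimizer.zero_grad()                      # clear gradients from the previous step
        batch_loss = demo_loss_fn(demo_model(X_batch), y_batch)
        batch_loss.backward()                           # gradients from this mini-batch only
        demo_optimizer.step()                           # one update per mini-batch
final_loss = full_loss()

print(f"initial loss = {initial_loss:.4f}, final loss = {final_loss:.4f}")
```

With only 174 samples, full-batch updates are cheap, so mini-batches are a design option here rather than a necessity.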


##### Step 4
### Evaluating the Model

In [16]:
# Evaluating the Model
with torch.no_grad():
    y_predicted = model(X_test)
    y_predicted.round_() # rounds probabilities above 0.5 to 1 and the rest to 0

    accuracy = y_predicted.eq(Y_test)
    accuracy = float(accuracy.sum())

    accuracy /= float(Y_test.shape[0]) # Calculating a fraction --- number of correct / number in total
    # scikit-learn metrics expect (y_true, y_pred), in that order
    recall = recall_score(Y_test, y_predicted) # tp / (tp + fn)
    precision = precision_score(Y_test, y_predicted) # tp / (tp + fp)
    f1 = f1_score(Y_test, y_predicted) # f1 score --- reliable measurement because both classes have relatively similar amount of samples ( 51% vs 49% split )
    
    print(f'Accuracy = {accuracy:.4f}')
    print(f'Recall Score = {recall}')
    print(f'Precision Score = {precision}') 
    print(f'F1 Score = {f1}')

Accuracy = 0.8571
Recall Score = 0.8235294117647058
Precision Score = 0.875
F1 Score = 0.8484848484848485


#### Across 20 training runs,
##### the mean accuracy is ~85%
##### the mean Recall Score is 82%
##### the mean Precision Score is 88%
##### the mean F1 Score is 84%

### Analyzing the Metrics

The accuracy is reasonably high, but may improve with tuning of the model's hyperparameters, such as the number of iterations in the training loop, which affects whether the model underfits or overfits. The random_state passed to train_test_split also changes the results, because it decides which samples land in the test set. It is not a model hyperparameter, though, so choosing it to maximize test accuracy would give an optimistic estimate. From rough observations, the model performs best at around 200 iterations.
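
Since the result depends on which samples end up in the test split, one common way to get an estimate that relies less on a single split is stratified k-fold cross-validation. The sketch below is an illustration under stated assumptions: it uses scikit-learn's own `LogisticRegression` inside a pipeline, on synthetic data from `make_classification`, not the PyTorch model or the DARWIN features.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with the same number of samples as DARWIN (assumption)
X_demo, y_demo = make_classification(
    n_samples=174, n_features=20, n_informative=5, random_state=0
)

# The pipeline refits the scaler on the training folds of every split
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X_demo, y_demo, cv=cv, scoring="accuracy")

print(f"mean accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the mean and spread over folds plays a similar role to averaging over 20 training runs, while making sure every sample is used for testing exactly once.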

The Recall Score is the fraction of handwriting samples from actual Alzheimer's patients that the model detects, tp / (tp + fn); a high recall means few patients are missed. The Precision Score tells us how reliable a 'positive' from our model actually is, tp / (tp + fp); a lower precision means we need to worry more about false positives. Both scores fall in the 80-90% range here, so the model misses some patients and also raises some false alarms. Lastly, the F1 Score summarizes the two, as it is the harmonic mean of the Recall Score and Precision Score. As expected, the average F1 score falls between the average Recall Score and average Precision Score.
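
The definitions above can be checked on a small hand-countable example. The labels below are made up for illustration; the sketch also shows that scikit-learn's metric functions take `(y_true, y_pred)` in that order, and that swapping the arguments turns recall into precision.

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Made-up labels (assumption): tp = 2, fn = 2, fp = 1, tn = 3
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 0, 0])

recall = recall_score(y_true, y_pred)        # tp / (tp + fn) = 2 / 4
precision = precision_score(y_true, y_pred)  # tp / (tp + fp) = 2 / 3
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

# Swapping the arguments exchanges the roles of fn and fp
swapped_recall = recall_score(y_pred, y_true)

print(f"recall = {recall:.4f}, precision = {precision:.4f}, f1 = {f1:.4f}")
print(f"recall with swapped arguments = {swapped_recall:.4f}")
```

The harmonic mean always lies between the two values it combines, which is why the F1 Score sits between recall and precision.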