#### Testing Perceptron

#### Initialising Dataset

First, we load the pre-processed dataset.

In [2]:
import random
import time
from random import randint

import matplotlib.pyplot as plt
import pandas as pd
from scipy.stats import uniform
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Perceptron
from sklearn.metrics import (
    ConfusionMatrixDisplay,
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier


# Read in csv
data = pd.read_csv('data/diabetic_data_formatted.csv')


# Remove the columns in which roughly half or more of the values are missing ('?')
data.drop(['weight', 'medical_specialty'], axis=1, inplace=True)

# Merge the two readmission codes (<30 and >30 days, encoded as 0 and 1) into a
# single YES class (0); the remaining code (2) means NO
data['readmitted'] = data['readmitted'].replace(1, 0)

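# Separate the features from the target column to predict
X = data.drop(columns=['readmitted'])
y = data['readmitted']


# Class names for the target variable (0 = YES, 2 = NO)
class_names = ['YES', 'NO']

# The formatted CSV is already numeric, so the features and target are used as-is;
# the encoder is instantiated but not applied in this section
le = LabelEncoder()

X_encoded = X
y_encoded = y

# Scaler used later to standardise the features
sc = StandardScaler()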

#### Finding baseline performance

Next, to find the baseline performance of the model, we run it with its default parameters and split the dataset into training and testing sets using random_state=42. We also define a function to avoid duplicating code.

In [3]:
# Define the model
ppn = Perceptron()

def getAccuracy(ppn):    
    print(ppn.get_params())

    X_train, X_test, y_train, y_test = train_test_split(X_encoded, y_encoded, test_size=0.25, random_state=42)
    
    # Train the model
    ppn.fit(X_train, y_train)
    
    # Make predictions
    y_pred = ppn.predict(X_test)
    
    # Evaluate accuracy
    print('\n Accuracy: %.2f' % accuracy_score(y_test, y_pred))

getAccuracy(ppn)

{'alpha': 0.0001, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 1000, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.47


After receiving an accuracy score of 0.47, we tried using grid search to find the optimum parameters. However, even for a simple model such as the perceptron, the search struggled to finish executing. We therefore identified the four most impactful parameters and optimised those individually instead.
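As an aside, a randomised search is one cheaper alternative to an exhaustive grid, because it fits only a fixed number of sampled configurations. The sketch below is illustrative only and was not used to produce the results in this notebook: the synthetic data from `make_classification`, the search space and the `n_iter` budget are all assumptions.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the real dataset (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=600, n_features=12, random_state=42)

# Hypothetical search space; the ranges are illustrative, not tuned
param_distributions = {
    "eta0": loguniform(1e-4, 1.0),
    "class_weight": [None, "balanced"],
    "shuffle": [True, False],
}

search = RandomizedSearchCV(
    Perceptron(max_iter=50),
    param_distributions=param_distributions,
    n_iter=10,  # number of sampled configurations, not training epochs
    cv=3,
    scoring="accuracy",
    random_state=42,
)
search.fit(X_demo, y_demo)

print(search.best_params_)
print("Best CV accuracy: %.2f" % search.best_score_)
```

The cost is bounded by `n_iter` times the number of folds, regardless of how many parameters are included in the search space.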

#### Finding the most impactful perceptron hyperparameters

First we listed the available hyperparameters.

In [3]:
ppn = Perceptron(
    alpha=1.0,
    penalty=None,
    max_iter=50,
    tol=0.0001,
    fit_intercept=True,
    eta0=0.0001,
    shuffle=True,
    verbose=0,
    warm_start=False,
    class_weight='balanced'
)

#### Choosing the most impactful parameters

##### Finding optimum value for max_iter

We first decided that max_iter has the greatest impact on the accuracy score, as almost all of the other parameters only take effect when the model runs for more than one iteration. A perceptron takes in input values, calculates a weighted sum and passes it through an activation function to assign a class. It then adjusts its weights whenever a prediction is wrong, so each iteration builds on the weights from the previous one. Without a suitable number of iterations, the effect of the other parameters on performance will be negligible.

Testing max_iter values at 50, 100 and 200:
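The weight-update loop described above can be sketched in a few lines of NumPy. This is a minimal illustration of the classic perceptron rule, not scikit-learn's implementation; the function name `train_perceptron` and the toy logical-AND data are assumptions made for the example.

```python
import numpy as np


def train_perceptron(X, y, eta0=1.0, max_iter=50):
    """Train a binary perceptron; y must contain the labels -1 and +1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for epoch in range(max_iter):
        errors = 0
        for xi, yi in zip(X, y):
            # Weighted sum followed by a step activation
            prediction = 1 if xi @ w + b > 0 else -1
            if prediction != yi:
                # Move the decision boundary towards the misclassified sample
                w += eta0 * yi * xi
                b += eta0 * yi
                errors += 1
        if errors == 0:
            # A full pass without mistakes: nothing left to adjust
            break
    return w, b, epoch + 1


# Toy, linearly separable data (logical AND), used purely for illustration
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_toy = np.array([-1, -1, -1, 1])

w, b, epochs_used = train_perceptron(X_toy, y_toy)
predictions = np.where(X_toy @ w + b > 0, 1, -1)
print(predictions, "after", epochs_used, "epochs")
```

Each pass reuses the weights left by the previous one, which is why a single iteration is rarely enough for the other parameters to matter.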

In [4]:
ppn = Perceptron(
    max_iter=50,
)

getAccuracy(ppn)

{'alpha': 0.0001, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 50, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.47


In [5]:
ppn = Perceptron(
    max_iter=100,
)

getAccuracy(ppn)

{'alpha': 0.0001, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 100, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.47


In [6]:
ppn = Perceptron(
    max_iter=200,
)

getAccuracy(ppn)

{'alpha': 0.0001, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 200, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.47


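We found that increasing max_iter beyond 50 does not improve the accuracy score, so we chose 50 as the value for max_iter. One possible explanation is that the default stopping criterion (tol=0.001 with n_iter_no_change=5) ends training before the iteration limit is reached.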
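One way to check how many epochs a fitted model actually ran is the `n_iter_` attribute, which scikit-learn sets on the estimator after `fit`. The sketch below uses synthetic data from `make_classification` as a stand-in for the real dataset, so the printed numbers are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron

# Synthetic stand-in for the real dataset (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=42)

epochs_run = {}
for max_iter in (50, 100, 200):
    model = Perceptron(max_iter=max_iter).fit(X_demo, y_demo)
    # n_iter_ is the number of epochs run before the stopping criterion was met
    epochs_run[max_iter] = model.n_iter_
    print("max_iter=%d, epochs actually run: %d" % (max_iter, model.n_iter_))
```

If `n_iter_` is the same for every setting and well below the limit, raising max_iter cannot change the result.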

##### Finding optimum value for alpha

Next, we decided that alpha has the second greatest impact on the accuracy score, as alpha controls the amount of regularisation and can prevent overfitting. Our data is particularly noisy, as it is filled with attributes from human data (age, glucose levels, etc.), meaning there is a lot of variation and randomness. Selecting a good alpha value can help to avoid overfitting.

Testing alpha values at 1e-5, 1e-3 and 1:

In [7]:
ppn = Perceptron(
    max_iter=50,
    alpha = 1e-5
)

getAccuracy(ppn)

{'alpha': 1e-05, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 50, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.47


In [8]:
ppn = Perceptron(
    max_iter=50,
    alpha = 1e-3
)

getAccuracy(ppn)

{'alpha': 0.001, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 50, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.47


In [9]:
ppn = Perceptron(
    max_iter=50,
    alpha = 1
)

getAccuracy(ppn)

{'alpha': 1, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 50, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.47


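Changing the alpha value made no difference: the accuracy remained at 0.47. This is expected, because penalty is None in these runs, and alpha only scales the regularisation term when a penalty is set. We opted to keep the middle value, 1e-3, as a compromise between under- and over-regularising should a penalty be enabled.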
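The interaction between alpha and penalty can be checked directly by comparing the learned coefficients. The sketch below is a minimal illustration on synthetic data; the helper `fit_coef` and the alpha values are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron

# Synthetic stand-in for the real dataset (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=42)


def fit_coef(penalty, alpha):
    """Fit a perceptron and return its learned coefficients."""
    model = Perceptron(penalty=penalty, alpha=alpha, max_iter=50, random_state=0)
    return model.fit(X_demo, y_demo).coef_


# Without a penalty, alpha is not used, so the coefficients should be identical
same_without_penalty = np.array_equal(fit_coef(None, 1e-5), fit_coef(None, 1e-1))

# With an L2 penalty, alpha shrinks the weights, so the coefficients should differ
same_with_l2 = np.allclose(fit_coef("l2", 1e-5), fit_coef("l2", 1e-1))

print("penalty=None, same coefficients:", same_without_penalty)
print("penalty='l2', same coefficients:", same_with_l2)
```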

##### Finding optimum value for shuffle

The default values for max_iter, alpha and tol appeared to be enough to reach the accuracy score of 0.47 seen so far. Next, we tried turning off shuffle, so that the training samples are presented in the same order on every iteration, to see whether the model could reach a higher accuracy score.

In [10]:
ppn = Perceptron(
    alpha=1e-3,
    max_iter=50,
    shuffle=False,
)

getAccuracy(ppn)

{'alpha': 0.001, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 50, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': False, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.48


This gave a small boost in accuracy of 0.01, at the cost of the model potentially becoming too dependent on the order of the original data.

##### Finding optimum value for class_weight

We noticed the data was somewhat unbalanced, with 52833 patients who have not been readmitted and only 45595 patients who have been readmitted, a difference of 7238. We therefore tried setting the class_weight parameter to 'balanced'.

In [11]:
print(data['readmitted'].value_counts())

readmitted
2    52833
0    45595
Name: count, dtype: int64
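To make the effect of class_weight='balanced' concrete, the sketch below computes the weights scikit-learn would assign for the class counts printed above, using `compute_class_weight`. The label array is rebuilt from those counts rather than loaded from the CSV, which is an assumption made to keep the example self-contained.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Class counts taken from the value_counts() output above
counts = {0: 45595, 2: 52833}

# Rebuild a label array with the same class distribution (illustration only)
y_demo = np.repeat(list(counts.keys()), list(counts.values()))

classes = np.array([0, 2])
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_demo)

# The same values by hand: n_samples / (n_classes * count_of_class)
manual = len(y_demo) / (len(classes) * np.array([counts[0], counts[2]]))

for label, weight in zip(classes, weights):
    print("class %d: weight %.4f" % (label, weight))
```

The minority class (0, readmitted) receives a weight above 1 and the majority class (2) a weight below 1, so mistakes on readmitted patients count slightly more during training.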


In [4]:
ppn = Perceptron(
    alpha=1e-3,
    max_iter=50,
    shuffle=False,
    class_weight=None
)

getAccuracy(ppn)

{'alpha': 0.001, 'class_weight': None, 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 50, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': False, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.48


In [5]:
ppn = Perceptron(
    alpha=1e-3,
    max_iter=50,
    shuffle=False,
    class_weight='balanced'
)

getAccuracy(ppn)

{'alpha': 0.001, 'class_weight': 'balanced', 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 50, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': False, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.58


After doing so, we reached the highest score for this model, 0.58. However, this was with shuffle set to False. For a robust model it is preferable to keep shuffle on, so that the model does not depend on the row order of the training data and can better handle randomness in real input data.

#### Scaling data for improved performance

We set shuffle back to True to improve robustness, and then standardised the features with StandardScaler to help the perceptron's performance. This run also uses max_iter=1000.

In [14]:

X_train, X_test, y_train, y_test = train_test_split(X_encoded, y_encoded, test_size=0.25, random_state=42)

ppn = Perceptron(
    alpha=1e-3,
    max_iter=1000,
    shuffle=True,
    class_weight='balanced'
)

print(ppn.get_params())

sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

# Train the model
ppn.fit(X_train_std, y_train)

# Make predictions
y_pred = ppn.predict(X_test_std)

# Evaluate accuracy
print('\n Accuracy: %.2f' % accuracy_score(y_test, y_pred))

{'alpha': 0.001, 'class_weight': 'balanced', 'early_stopping': False, 'eta0': 1.0, 'fit_intercept': True, 'l1_ratio': 0.15, 'max_iter': 1000, 'n_iter_no_change': 5, 'n_jobs': None, 'penalty': None, 'random_state': 0, 'shuffle': True, 'tol': 0.001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}

 Accuracy: 0.55


We achieved a score of 0.55, which is the best score we could achieve while keeping shuffle set to True.
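The scale-then-fit steps can also be bundled into a single estimator, which ensures the scaler is fitted on the training split only. The sketch below is one possible arrangement using `make_pipeline`; it runs on synthetic data from `make_classification`, so its accuracy says nothing about the diabetes dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=1000, n_features=10, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42
)

# The scaler is fitted on the training data only, then reused for prediction
pipeline = make_pipeline(
    StandardScaler(),
    Perceptron(alpha=1e-3, max_iter=1000, shuffle=True, class_weight="balanced"),
)
pipeline.fit(X_tr, y_tr)

demo_accuracy = pipeline.score(X_te, y_te)
print("Accuracy on synthetic data: %.2f" % demo_accuracy)
```

A pipeline like this can be passed to cross-validation or search utilities as a single estimator, which avoids leaking test-set statistics into the scaler.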