FINAL PERCEPTRON MODEL

# Library Imports

*   NumPy and pandas are imported for data manipulation and numerical operations.
*   Scikit-learn provides tools for model creation, data splitting, scaling, and performance evaluation.
*   From scikit-learn, MLPClassifier and Perceptron are used to build models, train_test_split and StandardScaler to prepare the data, and accuracy_score, confusion_matrix, and classification_report to evaluate the results.
*   urllib is used to download the dataset from a URL.


In [10]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier
import urllib.request

# Loading and Conversion of Data

*   The diabetes dataset in SVMLight format is downloaded using the URL and stored locally as 'diabetes_scale.libsvm'.
*   load_svmlight_file loads the dataset from this file, returning the feature matrix (X) and target vector (y).
*   X.toarray() converts the sparse matrix format of the dataset into a dense format (regular NumPy array).
*   The target vector y, which originally has values of -1 and 1, is converted into binary labels 0 and 1 using the transformation (y + 1) // 2.


In [11]:
# Load and prepare the data
url = 'https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/diabetes_scale'
urllib.request.urlretrieve(url, 'diabetes_scale.libsvm')
X, y = load_svmlight_file('diabetes_scale.libsvm')

# Convert the sparse matrix to dense format
X_dense = X.toarray()

# Convert target variable from -1 and 1 to 0 and 1
y = (y + 1) // 2
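
The label transformation can be checked on a tiny vector. The sketch below uses a made-up `y_raw` array (not part of the dataset) and shows that the result stays floating point, which is why class labels later appear as `0.0` and `1.0`.

```python
import numpy as np

# Hypothetical label vector in the {-1, +1} encoding described above
y_raw = np.array([-1.0, 1.0, 1.0, -1.0])

# (y + 1) // 2 maps -1 -> 0 and +1 -> 1; the dtype remains float
y_binary = (y_raw + 1) // 2
print(y_binary)  # [0. 1. 1. 0.]

# Optional: cast to integers if integer class labels are preferred
y_int = y_binary.astype(int)
print(y_int)  # [0 1 1 0]
```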

# Data Splitting

*   train_test_split divides the data into:
    - Training set (70% of the data).
    - Temporary set (30% of the data), which is split again into:
        - Validation set (15% of the overall data).
        - Test set (15% of the overall data).


In [13]:
# Split the dataset into training, validation, and test sets (70% training, 15% validation, 15% test)
X_train, X_temp, y_train, y_temp = train_test_split(X_dense, y, test_size=0.3,random_state=1882025)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5,random_state=1882025)
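
The resulting set sizes can be estimated by hand. The sketch below is an approximation, not scikit-learn's own code: it assumes the dataset has 768 samples and that the held-out portion of each split is rounded up. The helper `split_sizes` is a hypothetical name introduced for illustration.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_kept, n_held_out), assuming the held-out count is rounded up."""
    n_held_out = math.ceil(test_size * n_samples)
    return n_samples - n_held_out, n_held_out

n_total = 768  # assumed sample count of the diabetes dataset

# First split: 70% training, 30% temporary
n_train, n_temp = split_sizes(n_total, 0.3)

# Second split: the temporary set is halved into validation and test
n_val, n_test = split_sizes(n_temp, 0.5)

print(n_train, n_val, n_test)  # 537 115 116
```

Under these assumptions the test set has 116 samples, which agrees with the totals of the confusion matrices printed later.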

# Feature Scaling

**StandardScaler** is initialized to standardize the data (mean=0, variance=1 per feature). It is fitted on the training data only and then applied to the training, validation, and test sets, so all three are scaled with the same training-set statistics and no information from the validation or test data leaks into the scaler.


In [14]:
# Initialize the StandardScaler
scaler = StandardScaler()

# Fit the scaler to the training data and transform the datasets
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)
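
To make the "fit on training data, apply everywhere" idea concrete, here is a minimal NumPy sketch of the same arithmetic on made-up toy arrays (`X_tr` and `X_va` are hypothetical). It uses the population standard deviation, which is what StandardScaler uses.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 2 validation rows, 2 features
X_tr = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_va = np.array([[2.5, 25.0], [5.0, 50.0]])

# Statistics are computed from the training rows only
mean = X_tr.mean(axis=0)
std = X_tr.std(axis=0)  # population standard deviation (ddof=0)

# The same mean and std are reused for every split
X_tr_scaled = (X_tr - mean) / std
X_va_scaled = (X_va - mean) / std

# The training columns now have mean 0 and standard deviation 1;
# the validation columns generally do not, and that is expected
print(X_tr_scaled.std(axis=0))
print(X_va_scaled)
```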

# Customized Perceptron

This part of the code defines a custom perceptron with different activation functions.

- Attributes:
  - **learning_rate**: The step size for adjusting weights (stored as `self.lr`).
  - **n_iterations**: Number of passes over the training data.
  - **activation**: The activation function, one of 'step', 'sigmoid', or 'relu'.
  - **with_bias**: Whether to use a bias term in the model.

- Methods:
  - **activate**: Applies the chosen activation function (step, sigmoid, or ReLU) to the model's linear output.
  - **fit**: Trains the model with the perceptron rule. For each data point, the weights (and the bias, if enabled) are adjusted in proportion to the difference between the actual and predicted values.
  - **predict**: Uses the learned weights to make predictions on new data by applying the activation function to the weighted sum of inputs and thresholding the result.


In [15]:
class defined_Perceptron:
    def __init__(self, learning_rate=0.01, n_iterations=1000, activation='step', with_bias=True):
        self.lr = learning_rate
        self.n_iterations = n_iterations
        self.activation = activation
        self.with_bias = with_bias
        self.weights = None
        self.bias = 0 if with_bias else None

    # Apply the configured activation function to a single linear output
    def activate(self, x):
        if self.activation == 'step':
            return 1 if x >= 0 else 0
        elif self.activation == 'sigmoid':
            return 1 / (1 + np.exp(-x))
        elif self.activation == 'relu':
            return max(0, x)
        else:
            # Fail loudly instead of silently returning None
            raise ValueError(f"Unknown activation: {self.activation!r}")

    # Train the weights (and the bias, if enabled) with the perceptron update rule
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)

        for _ in range(self.n_iterations):
            for idx, x_i in enumerate(X):
                linear_output = np.dot(x_i, self.weights) + (self.bias if self.with_bias else 0)
                y_predicted = self.activate(linear_output)

                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                if self.with_bias:
                    self.bias += update

    # Predict binary labels for new data
    def predict(self, X):
        linear_output = np.dot(X, self.weights) + (self.bias if self.with_bias else 0)
        # Threshold the activation output at 0.5 to convert
        # continuous values into binary labels
        return np.array([1 if self.activate(x) >= 0.5 else 0 for x in linear_output])
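
A single pass through the update rule in `fit` can be traced by hand. The sketch below repeats the same arithmetic outside the class for one made-up sample (`x_i` and `y_true` are hypothetical) with the step activation and the default learning rate of 0.01.

```python
import numpy as np

lr = 0.01
weights = np.zeros(2)
bias = 0.0

x_i = np.array([1.0, 2.0])  # hypothetical input
y_true = 0                  # hypothetical label

# Forward pass: the weights start at zero, so the linear output is 0
linear_output = np.dot(x_i, weights) + bias

# Step activation: an output of exactly 0 is mapped to class 1
y_predicted = 1 if linear_output >= 0 else 0

# The prediction is wrong (1 instead of 0), so the update is negative
update = lr * (y_true - y_predicted)
weights = weights + update * x_i
bias = bias + update

print(weights, bias)
```

The weights move to roughly `[-0.01, -0.02]` and the bias to `-0.01`, pushing the linear output for this sample below zero. Had the label been 1, the prediction would have been correct and nothing would have changed.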

# Models

The following models are initialized:
- Basic Perceptron: The built-in perceptron from scikit-learn.
- MLP (Multi-Layer Perceptron): The built-in artificial neural network model from scikit-learn, with one hidden layer of 100 units.
- Custom Perceptron Models: The customised perceptron in several configurations, varying:
  - Whether a bias is used.
  - The type of activation function (ReLU, sigmoid, or step).


In [16]:
# Initialize models
models = {
    'Basic Perceptron': Perceptron(random_state=1882025),
    'MLP': MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000),
    'Perceptron No Bias': defined_Perceptron(with_bias=False),
    'Perceptron Updated Bias': defined_Perceptron(with_bias=True),
    'Perceptron ReLU': defined_Perceptron(activation='relu'),
    'Perceptron Sigmoid': defined_Perceptron(activation='sigmoid')
}

# Training and Evaluation

For each model:
-	Training:
  - Scikit-learn models (Basic Perceptron and MLP) use their built-in fit function.
  - The custom perceptron models use the fit method defined in the defined_Perceptron class.
-	Prediction: The models predict labels for the test set (X_test_scaled).
-	Evaluation:
  -	Accuracy is computed using accuracy_score.
  -	The confusion matrix is computed, which provides detailed performance information:
      -	True Positives, True Negatives, False Positives, and False Negatives.
      -	The number of false negatives (cases where the model predicted 0 but the actual label was 1) is extracted from the confusion matrix to evaluate how often the model misses positive cases.


In [17]:
# Dictionary to store the evaluation results
results = {}

# Train and evaluate models
for name, model in models.items():
    # The scikit-learn models and the custom perceptrons both expose
    # fit/predict, so a single code path handles every model
    model.fit(X_train_scaled, y_train)
    y_pred = model.predict(X_test_scaled)

    # Calculate accuracy and confusion matrix
    accuracy = accuracy_score(y_test, y_pred)
    cm = confusion_matrix(y_test, y_pred)
    false_negatives = cm[1, 0]  # False negatives (assuming class 1 is positive)

    # Store the results
    results[name] = {
        'Accuracy': accuracy,
        'False Negatives': false_negatives,
        'Confusion Matrix': cm
    }



# Results

-	The accuracy and false negatives for each model are stored in a dictionary (results).
-	The model with the highest accuracy is identified and printed as the "Best Model."


In [8]:
best_model_name = None
best_accuracy = 0

print("Model Comparison Results:")
for model, metrics in results.items():
    print(f"\nModel: {model}")
    print(f"Accuracy: {metrics['Accuracy']:.4f}")
    print(f"False Negatives: {metrics['False Negatives']}")
    print(f"Confusion Matrix:\n{metrics['Confusion Matrix']}")

    # Track the best model
    if metrics['Accuracy'] > best_accuracy:
        best_accuracy = metrics['Accuracy']
        best_model_name = model

# Output the best model
print(f"\nBest Model: {best_model_name}")
print(f"Best Accuracy: {best_accuracy:.4f}")

Model Comparison Results:

Model: Basic Perceptron
Accuracy: 0.6897
False Negatives: 9
Confusion Matrix:
[[11 27]
 [ 9 69]]

Model: MLP
Accuracy: 0.7328
False Negatives: 11
Confusion Matrix:
[[18 20]
 [11 67]]

Model: Perceptron No Bias
Accuracy: 0.5776
False Negatives: 30
Confusion Matrix:
[[19 19]
 [30 48]]

Model: Perceptron Updated Bias
Accuracy: 0.6293
False Negatives: 22
Confusion Matrix:
[[17 21]
 [22 56]]

Model: Perceptron ReLU
Accuracy: 0.7672
False Negatives: 7
Confusion Matrix:
[[18 20]
 [ 7 71]]

Model: Perceptron Sigmoid
Accuracy: 0.7759
False Negatives: 7
Confusion Matrix:
[[19 19]
 [ 7 71]]

Best Model: Perceptron Sigmoid
Best Accuracy: 0.7759
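
The reported accuracy can be re-derived from the confusion matrix. The sketch below takes the Perceptron Sigmoid matrix printed above and assumes scikit-learn's layout (rows are true labels, columns are predicted labels); the recall and precision values are computed here and are not part of the original output.

```python
import numpy as np

# Perceptron Sigmoid confusion matrix, copied from the output above
cm = np.array([[19, 19],
               [7, 71]])

# Row-major order for a binary problem: TN, FP, FN, TP
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
recall = tp / (tp + fn)      # share of actual positives that were found
precision = tp / (tp + fp)   # share of predicted positives that were correct

print(f"Accuracy:  {accuracy:.4f}")  # Accuracy:  0.7759
print(f"Recall:    {recall:.4f}")
print(f"Precision: {precision:.4f}")
```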


# Classification Report

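For the best-performing model, a detailed classification report is printed using scikit-learn's classification_report. This includes precision, recall, F1-score, and support for each class. Note that the report shown below lists class supports of 45 and 71 and an accuracy of 0.75, which do not match the comparison output above (38 and 78, accuracy 0.7759). The execution counters (In [8] versus In [18]) suggest the two outputs were captured in different runs, so the figures should not be compared directly.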


In [18]:
# Classification report for the best model
if best_model_name in models:
    best_model = models[best_model_name]
    y_pred_best = best_model.predict(X_test_scaled)

    print(f"\nClassification Report for {best_model_name}:\n")
    print(classification_report(y_test, y_pred_best))


Classification Report for Perceptron Sigmoid:

              precision    recall  f1-score   support

         0.0       0.72      0.58      0.64        45
         1.0       0.76      0.86      0.81        71

    accuracy                           0.75       116
   macro avg       0.74      0.72      0.72       116
weighted avg       0.75      0.75      0.74       116



# Key aspects of the custom perceptron class

- Flexible Activation Functions make the defined_Perceptron class versatile. Each activation function behaves differently:

  - The step function produces hard 0/1 outputs, which is the classic perceptron for binary classification tasks.
  - The sigmoid function produces smooth outputs between 0 and 1 that can be read as probabilities. The decision boundary of a single unit is still linear.
  - ReLU passes positive values through unchanged and is more commonly used in the hidden layers of deeper networks.

  We checked how these choices affect the performance of the model.

- Bias Handling: The model can be configured with or without bias. Including bias gives the perceptron more flexibility, allowing it to better fit datasets where the decision boundary doesn't pass through the origin.

- Simple Weight Update: The perceptron updates its weights using a simple rule based on the prediction error, learning_rate * (y - y_predicted) * x. This makes it computationally cheap and easy to understand. With the step activation this is the classic perceptron rule, which involves no gradient. With the sigmoid activation the same expression coincides with a stochastic gradient step on the log loss. In every configuration a single unit can only learn a linear decision boundary, and the step-activation rule is only guaranteed to converge when the data are linearly separable.

- Thresholding Mechanism: With the sigmoid or ReLU activation the perceptron outputs continuous values, but the final classification is binary. The predict method therefore applies a threshold of 0.5 to the activation output: values of 0.5 or more become class 1, and everything else becomes class 0.
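
The effect of the 0.5 threshold depends on the activation. The sketch below is a vectorised re-statement of the three activations, not the class itself, and the linear outputs in `z` are made up. It shows that step and sigmoid both switch to class 1 at a linear output of 0, while ReLU only does so at 0.5.

```python
import numpy as np

def activate(z, kind):
    """Vectorised versions of the three activations (illustrative helper)."""
    if kind == 'step':
        return np.where(z >= 0, 1.0, 0.0)
    if kind == 'sigmoid':
        return 1.0 / (1.0 + np.exp(-z))
    if kind == 'relu':
        return np.maximum(0.0, z)
    raise ValueError(f"Unknown activation: {kind!r}")

# Hypothetical linear outputs (weighted sum plus bias)
z = np.array([-0.2, 0.0, 0.3, 0.7])

predicted = {}
for kind in ('step', 'sigmoid', 'relu'):
    # Same rule as predict: class 1 when the activation is at least 0.5
    predicted[kind] = (activate(z, kind) >= 0.5).astype(int)
    print(kind, predicted[kind])

# step    [0 1 1 1]
# sigmoid [0 1 1 1]
# relu    [0 0 0 1]
```

So the ReLU variant effectively uses a shifted decision boundary compared with the other two, which is one reason the configurations can score differently on the same data.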