# Appendix

## MLPClassifier - Wisconsin breast cancer dataset
Python code that splits the original Wisconsin breast cancer dataset into two subsets: training/validation (80%), and test (20%).

In [1]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import numpy as np

# Load the dataset
breast_cancer = load_breast_cancer()
X_breast_cancer = breast_cancer.data
y_breast_cancer = breast_cancer.target
breast_cancer_target_names = breast_cancer.target_names

# Dataset split: 68% training, 12% validation and 20% test.

# Setting 'random_state' to a fixed number (here my A number, 20548919) makes the splits repeatable.
# X_tv and y_tv are temporary variables; tv stands for training and validation.
# train_test_split produces only two subsets per call, so first split into 80% (training + validation) and 20% (test).
X_tv, X_test, y_tv, y_test = train_test_split(X_breast_cancer, y_breast_cancer, test_size=0.2, random_state=20548919)

Python code that uses an additional split to create a validation dataset for tuning the MLP model hyperparameters.

In [2]:
# Split the temporary set into training and validation sets: 15% of it for validation, the rest for training.
# Relative to the entire dataset, that is about 12% validation and 68% training.
X_train, X_validation, y_train, y_validation = train_test_split(X_tv, y_tv, test_size=0.15, random_state=20548919)

# Sizes of each dataset (just to see if the split was done correctly)
print("Original size: ", X_breast_cancer.shape[0])
print("Training set size: ", X_train.shape[0], " (", np.round(100*X_train.shape[0]/X_breast_cancer.shape[0]), ('% )'))
print("Validation set size: ", X_validation.shape[0], " (", np.round(100*X_validation.shape[0]/X_breast_cancer.shape[0]), ('% )'))
print("Test set size: ", X_test.shape[0], " (", np.round(100*X_test.shape[0]/X_breast_cancer.shape[0]), ('% )'))

Original size:  569
Training set size:  386  ( 68.0 % )
Validation set size:  69  ( 12.0 % )
Test set size:  114  ( 20.0 % )
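
The printed sizes can be checked by hand. The following is a minimal sketch of the arithmetic, assuming that `train_test_split` rounds the held-out part up and gives the remainder to the other part when only a fractional `test_size` is passed; the helper name `two_stage_split_sizes` is hypothetical and only serves this illustration.

```python
import math

def two_stage_split_sizes(n_samples, test_fraction=0.2, validation_fraction=0.15):
    """Sizes after splitting off the test set first and the validation set second.

    Assumption: the held-out part is rounded up and the rest goes to the
    other part, which reproduces the sizes printed in this appendix.
    """
    n_test = math.ceil(test_fraction * n_samples)
    n_tv = n_samples - n_test                      # training + validation
    n_validation = math.ceil(validation_fraction * n_tv)
    n_train = n_tv - n_validation
    return n_train, n_validation, n_test

n_train, n_validation, n_test = two_stage_split_sizes(569)
print(n_train, n_validation, n_test)
for name, size in [("train", n_train), ("validation", n_validation), ("test", n_test)]:
    print(name, round(100 * size / 569, 1), "%")
```

The validation share of the whole dataset is roughly 0.8 × 0.15 = 0.12, which is where the 68% / 12% / 20% figures come from.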


Python code that uses MLPClassifier to train, validate and test an MLP model

In [3]:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report

# Training the model
# MLPClassifier is initialized with modified hyperparameters
mlp_classifier = MLPClassifier(hidden_layer_sizes=(100, 50), activation='relu', alpha=0.001, random_state=20548919, max_iter=125, solver="lbfgs")
mlp_classifier.fit(X_train, y_train)

# Score of the final model on each dataset: training, validation and test.
# For a classifier, score() returns the mean accuracy (not R^2).
# Training set
training_accuracy = mlp_classifier.score(X_train, y_train)
# Validation set
validation_accuracy = mlp_classifier.score(X_validation, y_validation)
# Test set
test_accuracy = mlp_classifier.score(X_test, y_test)
# Outputs
print("Training accuracy: ", training_accuracy)
print("Validation accuracy: ", validation_accuracy)
print("Test accuracy: ", test_accuracy, "\n")

# Predictions of the model on the test dataset
y_test_prediction = mlp_classifier.predict(X_test)

# Classification report for the test set to see how well the model predicts the classes in the test data
print("Classification report:\n", "\n", classification_report(y_test, y_test_prediction, target_names = breast_cancer_target_names))

# Well predicted example
# Initialization
index_well = -1
# Loop and compare to see if the prediction is equal to the ground truth
for i in range(len(y_test)):
    ground_truth = y_test[i]
    prediction = y_test_prediction[i]
    if ground_truth == prediction:
        # Save index if it is the same
        index_well = i
        # Stop looking for more (just one example needed)
        break

# Poorly predicted example
# Initialization
index_poor = -1
# Loop and compare to see if the prediction is different from the ground truth
for i in range(len(y_test)):
    ground_truth = y_test[i]
    prediction = y_test_prediction[i]
    # Check if the predicted value does NOT match the actual value
    if ground_truth != prediction:
        # Save index if it is different
        index_poor = i
        # Stop looking for more (just one example needed)
        break

print("Well-predicted example:", "\n", X_test[index_well], 
      "\n", "Ground truth (Actual label):", y_test[index_well], 
      "\n",  "Predicted label:", y_test_prediction[index_well],"\n")
print("Poorly-predicted example:", "\n", X_test[index_poor], "\n", 
      "Ground truth (Actual label):", y_test[index_poor], "\n", 
      "Predicted label:", y_test_prediction[index_poor])


Training accuracy:  0.9455958549222798
Validation accuracy:  0.9130434782608695
Test accuracy:  0.9298245614035088 

Classification report:
 
               precision    recall  f1-score   support

   malignant       0.97      0.85      0.91        46
      benign       0.91      0.99      0.94        68

    accuracy                           0.93       114
   macro avg       0.94      0.92      0.93       114
weighted avg       0.93      0.93      0.93       114
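
To make the report easier to read, the sketch below recomputes precision, recall and F1 from a 2x2 table of counts. The counts are not part of the recorded output: they are inferred here as the integers that reproduce the supports (46 and 68), the rounded scores and the test-set score of 106/114, so treat them as an assumption. The helper `per_class_metrics` is hypothetical.

```python
# Confusion counts inferred from the report (an assumption, not recorded output).
# Rows are true classes, columns are predicted classes, order [malignant, benign].
confusion = [[39, 7],
             [1, 67]]

def per_class_metrics(confusion, k):
    """Precision, recall and F1 for class index k of a square confusion table."""
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                    # true class k, predicted otherwise
    fp = sum(row[k] for row in confusion) - tp     # predicted k, true class differs
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[k][k] for k in range(len(confusion))) / total

for k, name in enumerate(["malignant", "benign"]):
    p, r, f = per_class_metrics(confusion, k)
    print(f"{name:>10}  precision={p:.3f}  recall={r:.3f}  f1={f:.3f}")
print(f"accuracy={accuracy:.4f}")
```

Under these assumed counts, 7 of the 46 malignant cases are predicted as benign, which is what the lower malignant recall expresses.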

Well-predicted example: 
 [2.425e+01 2.020e+01 1.662e+02 1.761e+03 1.447e-01 2.867e-01 4.268e-01
 2.012e-01 2.655e-01 6.877e-02 1.509e+00 3.120e+00 9.807e+00 2.330e+02
 2.333e-02 9.806e-02 1.278e-01 1.822e-02 4.547e-02 9.875e-03 2.602e+01
 2.399e+01 1.809e+02 2.073e+03 1.696e-01 4.244e-01 5.803e-01 2.248e-01
 3.222e-01 8.009e-02] 
 Ground truth (Actual label): 0 
 Predicted label: 0 

Poorly-predicted example: 
 [1.625e+01 1.951e+01 1.098e+02 8.158e+02 1.026e-01 1.893e-01 2.236e-01
 9.194e-02 2.151e-01 6.578e-02 3.147e-01

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
  self.n_iter_ = _check_optimize_result("lbfgs", opt_res, self.max_iter)
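
The solver message above suggests raising `max_iter` or scaling the data. A minimal sketch of the scaling option follows, assuming a `StandardScaler` placed in front of the same network in a pipeline; `max_iter=1000` is an arbitrary assumption, and this variant was not run for this appendix, so no scores are quoted for it.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same two-stage split as in the cells above
X, y = load_breast_cancer(return_X_y=True)
X_tv, X_test, y_tv, y_test = train_test_split(X, y, test_size=0.2, random_state=20548919)
X_train, X_validation, y_train, y_validation = train_test_split(
    X_tv, y_tv, test_size=0.15, random_state=20548919)

# The scaler is fitted on the training data only (inside the pipeline),
# so no information from the validation or test sets leaks into it.
scaled_mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(100, 50), activation="relu", alpha=0.001,
                  solver="lbfgs", max_iter=1000, random_state=20548919),  # max_iter is an assumption
)
scaled_mlp.fit(X_train, y_train)

validation_accuracy = scaled_mlp.score(X_validation, y_validation)
print("Validation accuracy (scaled inputs):", validation_accuracy)
```

Comparing this validation accuracy with the unscaled model would show whether scaling helps before the test set is touched again.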


## MLPRegressor - Diabetes dataset

The above process should be repeated for the Diabetes dataset using MLPRegressor. Shuffle and split the original dataset into training/validation (80%) and test (20%) sets. Be sure to use the "random_state" input, so we can recreate the same split when testing your code. Then, develop a documented process to determine a set of hyperparameters that do the best job of predicting the targets for the examples in the validation set. You may create a separate validation set or use cross-validation.
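
As an illustration of the cross-validation option, here is a hand-rolled sketch of how k-fold cross-validation partitions the training examples. It is not scikit-learn's implementation; the function name `k_fold_indices` and the choice to shuffle are assumptions made for this example.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds=5, seed=20548919):
    """Yield (train_idx, validation_idx) pairs; each sample is validated exactly once."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)          # shuffling is an assumption of this sketch
    folds = np.array_split(shuffled, n_folds)      # nearly equal-sized folds
    for i in range(n_folds):
        validation_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train_idx, validation_idx

# 300 is the size of the diabetes training set used in this appendix
for fold, (train_idx, validation_idx) in enumerate(k_fold_indices(300)):
    print("fold", fold, "train:", len(train_idx), "validation:", len(validation_idx))
```

Each candidate set of hyperparameters is trained once per fold and scored on the held-out fold; the mean of those scores is what ranks the candidates.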

In [4]:
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
import numpy as np

# Loading the diabetes data
diabetes = load_diabetes()
X_diabetes = diabetes.data                              # feature matrix (one row per sample)
y_diabetes = diabetes.target                            # disease progression
diabetes_features_names = diabetes.feature_names        # names of the features

# Dataset split: 68% training, 12% validation and 20% test.

# Setting 'random_state' to a fixed number (here my A number, 20548919) makes the splits repeatable.
# X_tv and y_tv are temporary variables; tv stands for training and validation.
# train_test_split produces only two subsets per call, so first split into 80% (training + validation) and 20% (test).
X_tv, X_test, y_tv, y_test = train_test_split(X_diabetes, y_diabetes, test_size=0.2, random_state=20548919)
# Split the temporary set into training and validation sets: 15% of it for validation, the rest for training.
# Relative to the entire dataset, that is about 12% validation and 68% training.
X_train, X_validation, y_train, y_validation = train_test_split(X_tv, y_tv, test_size=0.15, random_state=20548919)
# Sizes of each dataset (just to see if the split was done correctly)
print("Original size: ", X_diabetes.shape[0])
print("Training set size: ", X_train.shape[0], " (", np.round(100*X_train.shape[0]/X_diabetes.shape[0]), ('% )'))
print("Validation set size: ", X_validation.shape[0], " (", np.round(100*X_validation.shape[0]/X_diabetes.shape[0]), ('% )'))
print("Test set size: ", X_test.shape[0], " (", np.round(100*X_test.shape[0]/X_diabetes.shape[0]), ('% )'))

Original size:  442
Training set size:  300  ( 68.0 % )
Validation set size:  53  ( 12.0 % )
Test set size:  89  ( 20.0 % )
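
The cost of the hyperparameter search for the regressor can be estimated before running it. The sketch below enumerates the same grid that is passed to `GridSearchCV` in this section and counts the model fits for 5-fold cross-validation; only the standard library is used, and the variable names are illustrative.

```python
from itertools import product

# Same values as the parameter grid used for the MLPRegressor search
parameter_grid = {
    'hidden_layer_sizes': [(50,), (100,), (50, 50), (100, 50), (50, 100), (100, 100), (50, 50, 50)],
    'activation': ['relu', 'tanh'],
    'solver': ['adam', 'sgd'],
    'alpha': [0.0001, 0.001, 0.005, 0.01],
    'max_iter': [10000],
    'learning_rate_init': [0.01, 0.05, 0.1],
}

# Every combination of one value per hyperparameter is one candidate
candidates = [dict(zip(parameter_grid, values))
              for values in product(*parameter_grid.values())]
n_candidates = len(candidates)   # 7 * 2 * 2 * 4 * 1 * 3
n_folds = 5
n_fits = n_candidates * n_folds  # cross-validation fits, excluding the final refit

print("Candidates:", n_candidates)   # 336
print("Cross-validation fits:", n_fits)  # 1680
```

With `max_iter` set to 10000, each of these fits can be slow, which is why the search is run with `n_jobs=-1`.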


In [5]:
from sklearn.model_selection import GridSearchCV

# Cross-validation
parameter_grid = {
    'hidden_layer_sizes': [(50,), (100,), (50, 50), (100, 50), (50, 100), (100, 100), (50, 50, 50)],
    'activation': ['relu', 'tanh'],
    'solver': ['adam', 'sgd'],
    'alpha': [0.0001, 0.001, 0.005, 0.01],
    'max_iter': [10000],
    'learning_rate_init': [0.01, 0.05, 0.1]
}
# Initialize regressor with random state
mlp_regressor = MLPRegressor(random_state = 20548919)
# Search for best performing set of hyperparameters
clf = GridSearchCV(mlp_regressor, parameter_grid, cv=5, n_jobs=-1)
clf.fit(X_train, y_train)
# Show the best hyperparameters found
print("From the search the best hyperparameters are: ", clf.best_params_)

# R^2 score of the final model on each dataset: training, validation and test

# GridSearchCV (refit=True by default) has already refit the best configuration on X_train;
# best_estimator_ returns that fitted model.
# As the best hyperparameters have been found and the validation dataset will not be used again
# for tuning, the model is retrained on the combined training and validation data (best practice).
best_hyperparameters_model = clf.best_estimator_
print("Training + Validation dataset size: ", X_tv.shape[0], " (", np.round(100*X_tv.shape[0]/X_diabetes.shape[0]), ('% )'))
best_hyperparameters_model.fit(X_tv, y_tv)
# Note: X_train and X_validation are both part of X_tv, so the first two scores below are
# in-sample; only the test score is measured on data the final model has not seen.
# Training set
training_accuracy = best_hyperparameters_model.score(X_train, y_train)
# Validation set
validation_accuracy = best_hyperparameters_model.score(X_validation, y_validation)
# Test set
test_accuracy = best_hyperparameters_model.score(X_test, y_test)
# Outputs
print("Training R^2 score: ", training_accuracy)
print("Validation R^2 score: ", validation_accuracy)
print("Test R^2 score: ", test_accuracy, "\n")

# Well predicted and poorly predicted examples
# Initialization
index_well = -1
index_poor = -1
max_error = -float('inf') # Initialize maximum error to a small value

y_test_prediction = best_hyperparameters_model.predict(X_test)

# Error margin for a well-predicted example (This is an arbitrary number)
error_margin = 5

# Loop to find well-predicted and poorly-predicted examples
for i in range(len(y_test)):
    ground_truth = y_test[i]
    prediction = y_test_prediction[i]
    error = abs(ground_truth - prediction)
    # Check for well-predicted example within the permissible error margin
    if error <= error_margin and index_well == -1:  # Also ensure only the first match is saved
        index_well = i
    # Check for the poorly-predicted example with the largest error
    if error > max_error:
        max_error = error
        index_poor = i

# Display the well-predicted example
print("Well-predicted example:", "\n",
      X_test[index_well], "\n",
      "Ground truth (Actual label):", y_test[index_well], "\n",
      "Predicted label:", y_test_prediction[index_well], "\n",
      "Error:", abs(y_test[index_well] - y_test_prediction[index_well]), "\n")

# Display the poorly-predicted example
print("Poorly-predicted example:", "\n",
      X_test[index_poor], "\n",
      "Ground truth (Actual label):", y_test[index_poor], "\n",
      "Predicted label:", y_test_prediction[index_poor], "\n",
      "Error:", abs(y_test[index_poor] - y_test_prediction[index_poor]))


From the search the best hyperparameters are:  {'activation': 'relu', 'alpha': 0.01, 'hidden_layer_sizes': (50, 50), 'learning_rate_init': 0.05, 'max_iter': 10000, 'solver': 'adam'}
Training + Validation dataset size:  353  ( 80.0 % )
Training R^2 score:  0.5769625411595452
Validation R^2 score:  0.5623033269742508
Test R^2 score:  0.5570637791269027 

Well-predicted example: 
 [ 0.04897352  0.05068012  0.08864151  0.08728655  0.03558177  0.02154596
 -0.02499266  0.03430886  0.06605067  0.13146972] 
 Ground truth (Actual label): 310.0 
 Predicted label: 309.3141682205302 
 Error: 0.6858317794698223 

Poorly-predicted example: 
 [ 0.04170844  0.05068012 -0.03854032  0.05285804  0.07686035  0.11642994
 -0.03971921  0.07120998 -0.02251653 -0.01350402] 
 Ground truth (Actual label): 253.0 
 Predicted label: 106.49909223620975 
 Error: 146.50090776379025
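
For reference, the R^2 score reported for the regressor is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it by hand on hypothetical targets and recomputes the absolute errors of the two examples printed above; the function name `r2_score_manual` is an assumption made for this illustration.

```python
import numpy as np

def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (coefficient of determination)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # error left after prediction
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # error of always predicting the mean
    return 1.0 - ss_res / ss_tot

# Hypothetical targets, used only to show the two reference points of the scale
y_true = np.array([100.0, 150.0, 200.0, 250.0])
print(r2_score_manual(y_true, y_true))                      # perfect predictions -> 1.0
print(r2_score_manual(y_true, np.full(4, y_true.mean())))   # always the mean -> 0.0

# Absolute errors of the two examples reported above
well_error = abs(310.0 - 309.3141682205302)
poor_error = abs(253.0 - 106.49909223620975)
print(well_error, poor_error)
```

On this scale, a test R^2 of about 0.56 means the model removes a bit more than half of the squared error that a constant mean prediction would make.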
