# SENG 474 A02: Assignment 1
## Neural Networks 
Sean McAuliffe, V00913346  
February 3, 2023

---

We will use the suggested three-layer network architecture: one input layer, one hidden layer, and one output layer. The input layer will have 104 nodes, one for each feature of the input feature vector.
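
To give a sense of scale for this architecture, the following sketch counts the trainable parameters of a fully connected network. The helper `count_parameters` is hypothetical (it is not part of the assignment code), and the hidden layer size of 100 is only an illustrative assumption; the 104 inputs come from the description above.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected feed-forward network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        # Each layer has one weight per input-output pair, plus one bias per output
        total += fan_in * fan_out + fan_out
    return total

# 104 inputs as stated above; the hidden size of 100 is only an illustrative choice
print(count_parameters([104, 100, 1]))  # → 10601
print(count_parameters([104, 100, 2]))  # → 10702
```

Under these assumptions, moving from one output node to two adds only 101 parameters, so the choice of output layer should have little effect on training cost.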

We will experiment with different numbers of nodes in the hidden layer, and with the number of output nodes. One option is to have two output nodes, one for each output class (0 or 1). Another possibility is to use a single output node whose magnitude represents the predicted income class of the sample. Magnitudes greater than 0.5 will be taken to be a prediction of 1, and magnitudes less than 0.5 will be taken to be a prediction of 0.
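
The single-output-node rule can be sketched as follows. The helper `threshold_predictions` is hypothetical, and sending an output of exactly 0.5 to class 0 is an assumption, since the rule above does not cover that case.

```python
import numpy as np

def threshold_predictions(outputs, cutoff=0.5):
    """Map single-node outputs in [0, 1] to class labels 0 or 1."""
    outputs = np.asarray(outputs, dtype=float)
    # Outputs strictly above the cutoff become 1; everything else becomes 0.
    # Mapping an output of exactly 0.5 to class 0 is an assumption.
    return (outputs > cutoff).astype(int)

example_outputs = [0.12, 0.50, 0.73, 0.49]
print(threshold_predictions(example_outputs))  # → [0 0 1 0]
```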

We will also experiment with different activation functions for the hidden and output layers, different loss functions, different optimizers, and other training hyperparameters.
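
A search over several of these settings could be organized with `GridSearchCV`, as in the minimal sketch below. It runs on small synthetic stand-in data rather than the income dataset, and the grid values are illustrative assumptions, not recommendations. As far as we know, `MLPClassifier` exposes the hidden-layer activation and the solver but has no parameter for the loss function or the output activation, so those two choices would need a different library.

```python
import warnings
import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore", category=ConvergenceWarning)

# Synthetic stand-in data (NOT the income dataset): 200 samples, 5 features
rng = np.random.default_rng(0)
X_demo = rng.random((200, 5))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 1.0).astype(int)

# Hypothetical search space; the values are illustrative only
param_grid = {
    "hidden_layer_sizes": [(5,), (10,)],
    "activation": ["relu", "logistic"],
    "solver": ["adam", "sgd"],
}

# 3-fold cross-validation over all 8 combinations, scored by accuracy
search = GridSearchCV(
    MLPClassifier(max_iter=50, random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_demo, y_demo)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```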


---

## Environment Setup

In [27]:
# !pip3 install numpy
# !pip3 install pandas
# !pip3 install scikit-learn
# !pip3 install matplotlib
# !pip3 install graphviz

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV
import warnings
from sklearn.exceptions import ConvergenceWarning

---

## Data Preprocessing
Here the data is preprocessed according to the recommendation in Appendix A of the assignment specification. Specifically, for each feature of the training data, we normalize the feature to ensure that its minimum value is 0 and its maximum value is 1. We apply the same scaling to all datapoints for that feature. We also apply the same scaling to the test data.

In [28]:
income_dataset = np.genfromtxt('./cleaned_adult.csv', delimiter=',', skip_header=1, dtype=float)

def split_dataset(training_percent):
    """ Shuffle the dataset, split it into training and testing sets, and scale the features """
    # Shuffle a copy so that repeated calls never modify income_dataset itself
    shuffled = np.random.permutation(income_dataset)
    features = shuffled[:, :-1]
    labels = shuffled[:, -1]
    split_index = int(training_percent * features.shape[0])
    training_features, testing_features = features[:split_index], features[split_index:]
    training_labels, testing_labels = labels[:split_index], labels[split_index:]
    # https://www.statology.org/normalize-data-between-0-and-1/
    # Scale each feature to [0, 1] using the minimum and maximum of the training data,
    # then apply the same scaling to the test data
    feature_min = training_features.min(axis=0)
    feature_range = training_features.max(axis=0) - feature_min
    feature_range[feature_range == 0] = 1  # constant features: avoid division by zero
    training_features = (training_features - feature_min) / feature_range
    testing_features = (testing_features - feature_min) / feature_range
    return training_features, training_labels, testing_features, testing_labels

# Extract the feature names from the first row of the dataset
feature_names = np.genfromtxt('./cleaned_adult.csv',
                                delimiter=',',
                                max_rows=1,
                                dtype=str)[:-1]
label_names = ['poor', 'rich']

print("Data ready for retrieval via split_dataset function...")


Data ready for retrieval via split_dataset function...


---
## Sample Neural Network Code

In [13]:
# This example shows the construction of a neural network classifier, listing all
# of the sklearn MLP parameters and their default values.

mlpc = MLPClassifier(hidden_layer_sizes=(100,),
                    activation='relu',
                    solver='adam',
                    alpha=0.0001,
                    batch_size='auto',
                    learning_rate='constant',
                    learning_rate_init=0.001,
                    power_t=0.5,
                    max_iter=200,
                    shuffle=True,
                    random_state=None,
                    tol=0.0001,
                    verbose=False,
                    warm_start=False,
                    momentum=0.9,
                    nesterovs_momentum=True,
                    early_stopping=False,
                    validation_fraction=0.1,
                    beta_1=0.9,
                    beta_2=0.999,
                    epsilon=1e-08,
                    n_iter_no_change=10,
                    max_fun=15000)

X_train, y_train, X_test, y_test = split_dataset(0.8)
mlpc.fit(X_train, y_train)

print("Testing Accuracy: ", mlpc.score(X_test, y_test))
print("Training Accuracy: ", mlpc.score(X_train, y_train))


Testing Accuracy:  0.8438916528468767
Training Accuracy:  0.8796196478425519




---

## Experiment 0: Reducing Training Runtime
In this experiment, we will modify some of the default training parameters in an attempt to reduce the MLP training time while still maintaining an accuracy close to that achieved in the sample code above. The result will become our starting point for future hyperparameter tuning experiments, in which we will have to train many models.
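
One way to compare training times between configurations is to wrap `fit` in a wall-clock timer, as in the sketch below. The helper `time_fit` is hypothetical, the data is a small synthetic stand-in rather than the income dataset, and the two configurations are illustrative assumptions.

```python
import time
import warnings
import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore", category=ConvergenceWarning)

def time_fit(model, X, y):
    """Fit `model` on (X, y) and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start

# Synthetic stand-in data (NOT the income dataset): 500 samples, 20 features
rng = np.random.default_rng(0)
X_demo = rng.random((500, 20))
y_demo = (X_demo[:, 0] > 0.5).astype(int)

# Two illustrative configurations; sizes and solvers are arbitrary choices
candidates = {
    "adam, (100,)": MLPClassifier(hidden_layer_sizes=(100,), solver="adam",
                                  max_iter=50),
    "sgd, (50, 10, 10)": MLPClassifier(hidden_layer_sizes=(50, 10, 10), solver="sgd",
                                       learning_rate="adaptive", max_iter=50),
}
for name, model in candidates.items():
    seconds = time_fit(model, X_demo, y_demo)
    accuracy = model.score(X_demo, y_demo)
    print(f"{name}: {seconds:.2f} s, training accuracy {accuracy:.3f}")
```

Timings from a single run are noisy, so repeating each measurement a few times and comparing averages gives a fairer picture.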

In [32]:
# Experiment 0

# This classifier changes several of the sklearn MLP default parameters
# (hidden layer sizes, solver, learning rate schedule, max_iter) to try to reduce training time.

warnings.filterwarnings("ignore", category=ConvergenceWarning)

mlpc = MLPClassifier(hidden_layer_sizes=(50, 10, 10),
                    activation='relu',
                    solver='sgd', # stochastic gradient descent
                    alpha=0.0001,
                    batch_size='auto',
                    learning_rate='adaptive', # to prevent overshooting
                    learning_rate_init=0.001,
                    power_t=0.5,
                    max_iter=500,
                    shuffle=True,
                    random_state=None,
                    tol=0.0001,
                    verbose=True,
                    warm_start=False,
                    momentum=0.9,
                    nesterovs_momentum=True,
                    early_stopping=False,
                    validation_fraction=0.1,
                    beta_1=0.9,
                    beta_2=0.999,
                    epsilon=1e-08,
                    n_iter_no_change=10,
                    max_fun=15000)

# Split the dataset into training and testing sets
X_train, y_train, X_test, y_test = split_dataset(0.8)
mlpc.fit(X_train, y_train)

print("Testing Accuracy: ", mlpc.score(X_test, y_test))
print("Training Accuracy: ", mlpc.score(X_train, y_train))


Iteration 1, loss = 0.56802122
Iteration 2, loss = 0.55475718
Iteration 3, loss = 0.54530954
Iteration 4, loss = 0.53238434
Iteration 5, loss = 0.51403914
Iteration 6, loss = 0.48819275
Iteration 7, loss = 0.46176576
Iteration 8, loss = 0.44090504
Iteration 9, loss = 0.42462860
Iteration 10, loss = 0.41238684
Iteration 11, loss = 0.40299076
Iteration 12, loss = 0.39593860
Iteration 13, loss = 0.39027046
Iteration 14, loss = 0.38591414
Iteration 15, loss = 0.38218817
Iteration 16, loss = 0.37901907
Iteration 17, loss = 0.37631674
Iteration 18, loss = 0.37389211
Iteration 19, loss = 0.37161677
Iteration 20, loss = 0.36948432
Iteration 21, loss = 0.36771491
Iteration 22, loss = 0.36588480
Iteration 23, loss = 0.36482703
Iteration 24, loss = 0.36324604
Iteration 25, loss = 0.36200230
Iteration 26, loss = 0.36068768
Iteration 27, loss = 0.35961982
Iteration 28, loss = 0.35861942
Iteration 29, loss = 0.35765130
Iteration 30, loss = 0.35657158
Iteration 31, loss = 0.35573884
Iteration 32, los