$\newcommand{\xv}{\mathbf{x}}
\newcommand{\Xv}{\mathbf{X}}
\newcommand{\yv}{\mathbf{y}}
\newcommand{\zv}{\mathbf{z}}
\newcommand{\av}{\mathbf{a}}
\newcommand{\Wv}{\mathbf{W}}
\newcommand{\wv}{\mathbf{w}}
\newcommand{\tv}{\mathbf{t}}
\newcommand{\Tv}{\mathbf{T}}
\newcommand{\muv}{\boldsymbol{\mu}}
\newcommand{\sigmav}{\boldsymbol{\sigma}}
\newcommand{\phiv}{\boldsymbol{\phi}}
\newcommand{\Phiv}{\boldsymbol{\Phi}}
\newcommand{\Sigmav}{\boldsymbol{\Sigma}}
\newcommand{\Lambdav}{\boldsymbol{\Lambda}}
\newcommand{\half}{\frac{1}{2}}
\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}
\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}$

# Assignment 6: Neural Networks

Michael Johnesee

## Overview

This notebook trains neural networks with various configurations of hidden layers and units per hidden layer, then compares the results to choose the best hidden layer structure.  This is done for both a regression network and a classification network.

## Required Code

The following Python files are used by this notebook and must be downloaded into the same folder as the notebook.

Download [nn2.tar](http://www.cs.colostate.edu/~anderson/cs440/notebooks/nn2.tar)

The files are provided by Chuck Anderson for the purpose of this assignment:

* `neuralnetworks.py`
* `scaledconjugategradient.py`
* `mlutils.py`

## Required functions

`trainNNs(X, T, trainFraction, hiddenLayerStructures, numberRepetitions, numberIterations, classify)`

The arguments to `trainNNs` are

* `X` is a matrix of input data of shape `nSamples x nFeatures`
* `T` is a matrix of target data of shape `nSamples x nOutputs`
* `trainFraction` is the fraction of samples to use as training data. 1-`trainFraction` is the fraction of samples used as testing data.
* `hiddenLayerStructures` is a list of network architectures. For example, to test two networks, one with one hidden layer of 20 units, and one with 3 hidden layers with 5, 10, and 20 units in each layer, this argument would be `[[20], [5, 10, 20]]`.
* `numberRepetitions` is the number of times to train a neural network.  Training and testing performance are averaged separately (two averages) over this many training runs.
* `numberIterations` is the number of iterations to run the scaled conjugate gradient algorithm when a neural network is trained.
* `classify` is set to `True` if you are doing a classification problem, in which case `T` must be a single column of target class integers.
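
To make the role of `trainFraction` concrete, here is a minimal sketch of a random train/test split. It is not the implementation of `ml.partition` from `mlutils.py`: the helper name `split_by_fraction` and its `seed` parameter are assumptions made for illustration, and the sketch ignores any per-class handling a classification split might need.

```python
import numpy as np

def split_by_fraction(X, T, train_fraction, seed=None):
    """Hypothetical helper: shuffle the rows, then split them by train_fraction."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    order = rng.permutation(n_samples)                # random row order
    n_train = int(round(train_fraction * n_samples))  # rows used for training
    train_rows, test_rows = order[:n_train], order[n_train:]
    return X[train_rows], T[train_rows], X[test_rows], T[test_rows]

X_demo = np.arange(20).reshape((10, 2))   # 10 samples, 2 features
T_demo = np.arange(10).reshape((10, 1))   # 10 samples, 1 output
x_tr, t_tr, x_te, t_te = split_by_fraction(X_demo, T_demo, 0.8, seed=0)
print(x_tr.shape, t_tr.shape, x_te.shape, t_te.shape)
```

With 10 samples and a fraction of 0.8, eight rows go to training and the remaining two to testing; every row ends up in exactly one of the two sets.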

This function returns `results`, which is a list with one element for each network structure tested.  Each element is a list containing

* the hidden layer structure (as a list)
* a list of training data performance for each repetition
* a list of testing data performance for each repetition
* the number of seconds it took to run this many repetitions for this network structure.

`summarize(results)`

This function returns `summary`, which has the same layout as `results` but with the list of training performances and the list of testing performances each replaced by their mean. Each element of `summary` contains

* the hidden layer structure (as a list)
* the mean of the training data performance
* the mean of the testing data performance
* the number of seconds it took to run this many repetitions for this network structure.

`bestNetwork(summary)`

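This function returns the best element of `summary`, which is the element with the smallest mean test performance. Both performance measures used here are errors (RMSE or fraction misclassified), so smaller is better. The returned element contains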

* the hidden layer structure (as a list)
* the mean of the training data performance
* the mean of the testing data performance
* the number of seconds it took to run this many repetitions for this network structure.
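
The reduction and selection described above can be traced by hand on a tiny example. This is a sketch only: the numbers in `example_results` are made up for illustration and are not output from any real training run, and `statistics.mean` stands in for whatever averaging the real functions use.

```python
from statistics import mean

# Made-up values, one row per network structure:
# [hidden layers, train errors per repetition, test errors per repetition, seconds]
example_results = [
    [[20], [0.50, 0.54], [0.70, 0.74], 1.2],
    [[5, 10, 20], [0.30, 0.34], [0.90, 0.98], 3.4],
]

# As described for summarize: replace each list of errors by its mean.
example_summary = [[hidden, mean(train), mean(test), seconds]
                   for hidden, train, test, seconds in example_results]

# As described for bestNetwork: pick the row with the smallest mean test error.
example_best = min(example_summary, key=lambda row: row[2])
print(example_best[0])
```

Here the deeper network has the lower training error but the higher testing error, so the single-layer `[20]` structure is selected.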



### Import Statements

In [None]:
import mlutils as ml
import numpy as np
import scaledconjugategradient as scg
import neuralnetworks as nn
import copy
import pandas as pd
import time
import matplotlib.pyplot as plt
%matplotlib inline

### Required Function Code

In [None]:
def trainNNs(X, T, trainFraction, hiddenLayerStructures, numberRepetitions, numberIterations, classify=False):
    results = []

    for hidden in hiddenLayerStructures:
        start_time = time.time()
        train_data = []
        test_data = []
        for i in range(numberRepetitions):
            # Draw a fresh random train/test partition for every repetition.
            x_train, t_train, x_test, t_test = ml.partition(X, T, (trainFraction, 1 - trainFraction), classify)

            if classify:
                # One output unit per distinct class label in T.
                nnet = nn.NeuralNetworkClassifier(X.shape[1], hidden, len(np.unique(T)))
                nnet.train(x_train, t_train, nIterations=numberIterations)
                y_train, z_train, _ = nnet.use(x_train, allOutputs=True)
                y_test, z_test, _ = nnet.use(x_test, allOutputs=True)
                # Performance is the fraction of misclassified samples.
                train_data.append(np.sum(y_train != t_train) / t_train.shape[0])
                test_data.append(np.sum(y_test != t_test) / t_test.shape[0])

            else:
                nnet = nn.NeuralNetwork(x_train.shape[1], hidden, t_train.shape[1])
                nnet.train(x_train, t_train, nIterations=numberIterations)
                y_train, z_train = nnet.use(x_train, allOutputs=True)
                y_test, z_test = nnet.use(x_test, allOutputs=True)
                # Performance is the RMSE over all samples and outputs.
                train_data.append(np.sqrt(np.mean((y_train - t_train)**2)))
                test_data.append(np.sqrt(np.mean((y_test - t_test)**2)))

        # Record the structure, the per-repetition errors, and the elapsed seconds.
        results.append([hidden, train_data, test_data, time.time() - start_time])

    return results


def summarize(results):
    summary = []
    for result in results:
        summary.append([result[0], np.mean(result[1]), np.mean(result[2]), result[3]])
    return summary


def bestNetwork(summary):
    # Element 2 of each entry is the mean test error; smaller is better.
    return min(summary, key=lambda entry: entry[2])
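
The two performance measures computed inside `trainNNs` can be checked in isolation on hand-picked values. The helper names `rmse` and `fraction_misclassified` are introduced here for illustration only and are not part of the provided files.

```python
import numpy as np

def rmse(Y, T):
    """Root-mean-square error over all samples and outputs."""
    return np.sqrt(np.mean((Y - T) ** 2))

def fraction_misclassified(predicted, T):
    """Fraction of samples whose predicted class differs from the target."""
    return np.sum(predicted != T) / T.shape[0]

# Toy values chosen by hand for illustration.
T_reg = np.array([[1.0], [2.0], [3.0], [4.0]])
Y_reg = np.array([[1.0], [2.0], [3.0], [6.0]])   # one prediction is off by 2
print(rmse(Y_reg, T_reg))                        # sqrt(4 / 4) = 1.0

T_cls = np.array([[0], [1], [2], [1]])
Y_cls = np.array([[0], [1], [1], [1]])           # one of four is wrong
print(fraction_misclassified(Y_cls, T_cls))      # 1 / 4 = 0.25
```

Both helpers assume the predictions and targets have the same shape. Comparing an array of shape `(N,)` with one of shape `(N, 1)` would broadcast to `N x N` and give a wrong count.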

## Simple Examples


In [None]:
X = np.arange(10).reshape((-1,1))
T = X + 1 + np.random.uniform(-1, 1, ((10,1)))

In [None]:
plt.plot(X, T, 'o-');

In [None]:
nnet = nn.NeuralNetwork(X.shape[1], 2, T.shape[1])
nnet.train(X, T, 100)
nnet.getErrorTrace()

In [None]:
nnet = nn.NeuralNetwork(X.shape[1], [5, 5, 5], T.shape[1])
nnet.train(X, T, 100)
nnet.getErrorTrace()

In [None]:
results = trainNNs(X, T, 0.8, [2, 10, [10, 10]], 5, 100, classify=False)
results

In [None]:
results = trainNNs(X, T, 0.8, [0, 1, 2, 10, [10, 10], [5, 5, 5, 5], [2]*5], 50, 400, classify=False)

In [None]:
summarize(results)

In [None]:
best = bestNetwork(summarize(results))
print(best)
print('Hidden Layers {} Average RMSE Training {:.2f} Testing {:.2f} Took {:.2f} seconds'.format(*best))

A neural net with no hidden layers does best on this simple data set. 
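
This result is plausible because the targets were generated as a straight line plus noise. The sketch below fits such data with ordinary least squares via `np.linalg.lstsq`, used here as a stand-in for a network with no hidden layers (assuming such a network has a linear output and so reduces to a linear model). The seed and the `_demo` variable names are illustrative and not part of the original cells.

```python
import numpy as np

rng = np.random.default_rng(42)                  # fixed seed, illustrative only
X_demo = np.arange(10).reshape((-1, 1))
T_demo = X_demo + 1 + rng.uniform(-1, 1, (10, 1))

# Append a column of ones so the fit includes an intercept.
A = np.hstack((X_demo, np.ones_like(X_demo)))
coeffs, *_ = np.linalg.lstsq(A, T_demo, rcond=None)
slope, intercept = coeffs.ravel()

# RMSE of the fitted line on the same ten points.
fit_rmse = np.sqrt(np.mean((A @ coeffs - T_demo) ** 2))
print('slope {:.2f}, intercept {:.2f}, RMSE {:.2f}'.format(slope, intercept, fit_rmse))
```

The fitted slope stays close to 1 and the residual error is no larger than the noise, so hidden layers have little structure left to model and mostly add parameters that can fit the noise.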

## Data for Regression Experiment

The following section uses data from the UCI Machine Learning Repository.

Download [Appliances energy prediction](http://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction) 

You can do this by visiting the Data Folder for this data set, or just do this:

     !wget http://archive.ics.uci.edu/ml/machine-learning-databases/00374/energydata_complete.csv

The following functions read the data into a dataframe, remove unnecessary columns, and split the result into input and target dataframes.

In [None]:
def energy_df_read_csv(address):
    energy_df = pd.read_csv(address)
    energy_df.drop(['date', 'rv1', 'rv2'], axis=1, inplace=True)
    return energy_df

def energy_df_reshape(energy_df):
    t_energy_df = energy_df.iloc[:, 0:2]
    x_energy_df = energy_df.iloc[:, 2:]
    return x_energy_df, t_energy_df

In [None]:
energy_df = energy_df_read_csv('energydata_complete.csv')
energy_df

The code below splits the data into two dataframes for training. The first two columns, labelled `Appliances` and `lights`, are the target variables, and the remaining 24 columns are the input features.

In [None]:
x_energy_df, t_energy_df = energy_df_reshape(energy_df)
x_energy_df 
t_energy_df

Train several neural networks on the data for 100 iterations and plot the error trace (`nnet.getErrorTrace()`).  100 iterations may not be enough.  If the error for your larger networks is still decreasing after 100 iterations, you should train all nets for more than 100 iterations.
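
Whether an error trace is "still decreasing" can be judged by eye from the plot, or with a rough numerical check like the sketch below. The helper `still_decreasing`, its `window` and `tolerance` defaults, and the synthetic traces are all assumptions made for illustration; the check also assumes the trace is a one-dimensional sequence of error values.

```python
def still_decreasing(error_trace, window=10, tolerance=0.01):
    """Hypothetical check: True if the error fell by more than `tolerance`
    (as a relative change) over the last `window` iterations of the trace."""
    trace = list(error_trace)
    if len(trace) <= window:
        return True                      # too short to judge; keep training
    earlier, latest = trace[-window - 1], trace[-1]
    if earlier == 0:
        return False                     # already at zero error
    return (earlier - latest) / abs(earlier) > tolerance

# Synthetic traces for illustration only.
flat_trace = [1.0 / (1 + i) for i in range(20)] + [0.05] * 80
falling_trace = [0.99 ** i for i in range(100)]
print(still_decreasing(flat_trace), still_decreasing(falling_trace))
```

A trace that has flattened returns `False`, while one that keeps dropping by about 1% per iteration returns `True`, suggesting more iterations are worthwhile.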

Now use your `trainNNs`, `summarize`, and `bestNetwork` functions on this data to investigate various network sizes.

Convert the dataframes to numpy arrays for easier manipulation in `trainNNs`.

In [None]:
x_energy_np = x_energy_df.values
t_energy_np = t_energy_df.values

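In [None]:
# Train each network directly so the plotted trace belongs to the network
# just trained, not to one left over from an earlier cell.
nnet = nn.NeuralNetwork(x_energy_np.shape[1], 0, t_energy_np.shape[1])
nnet.train(x_energy_np, t_energy_np, 100)
plt.plot(nnet.getErrorTrace())
plt.show()

In [None]:
nnet = nn.NeuralNetwork(x_energy_np.shape[1], 5, t_energy_np.shape[1])
nnet.train(x_energy_np, t_energy_np, 100)
plt.plot(nnet.getErrorTrace())
plt.show()

In [None]:
nnet = nn.NeuralNetwork(x_energy_np.shape[1], [5, 5], t_energy_np.shape[1])
nnet.train(x_energy_np, t_energy_np, 100)
plt.plot(nnet.getErrorTrace())
plt.show()

In [None]:
nnet = nn.NeuralNetwork(x_energy_np.shape[1], [10, 10], t_energy_np.shape[1])
nnet.train(x_energy_np, t_energy_np, 100)
plt.plot(nnet.getErrorTrace())
plt.show()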

In [None]:
results = trainNNs(x_energy_np, t_energy_np, 0.8, [0, 5, [5, 5], [10, 10]], 10, 200)

In [None]:
summarize(results)

In [None]:
bestNetwork(summarize(results))

Test at least 10 different hidden layer structures.  Larger numbers of layers and units may do the best on training data, but not on testing data.
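
One way to spot structures that do well on training data but not on testing data is to sort a summary by test error and look at the gap between the two means. The sketch below assumes rows laid out like the output of `summarize`; the helper `rank_by_test` is hypothetical and the numbers in `demo_summary` are made up for illustration.

```python
# Made-up rows: [hidden layers, mean train error, mean test error, seconds].
demo_summary = [
    [[0], 0.80, 0.82, 0.5],
    [[10], 0.60, 0.66, 1.5],
    [[10] * 5, 0.30, 0.75, 6.0],
]

def rank_by_test(summary):
    """Hypothetical helper: sort rows by test error and append the test-train gap."""
    ranked = sorted(summary, key=lambda row: row[2])
    return [row + [row[2] - row[1]] for row in ranked]

for hidden, train, test, seconds, gap in rank_by_test(demo_summary):
    print('{!s:25} train {:.2f}  test {:.2f}  gap {:.2f}'.format(hidden, train, test, gap))
```

In this made-up example the five-layer network has the lowest training error but the largest gap, which is the pattern the text warns about.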

In [None]:
hidden_layers = [[0], [2], [5], [10], [2]*2, [5]*2, [10]*2, [2]*5, [5]*5, [10]*5, [2]*10, [5]*10]
results = trainNNs(x_energy_np, t_energy_np, 0.8, hidden_layers, 10, 200, False)

In [None]:
summarize(results)

In [None]:
bestNetwork(summarize(results))

In [None]:
best_hidden = bestNetwork(summarize(results))
best_hidden

Now train another network with your best hidden layer structure on 0.8 of the data and use the trained network on the testing data.

In [None]:
x_train_np, t_train_np, x_test_np, t_test_np = ml.partition(x_energy_np, t_energy_np, (0.8, 0.2), False)
better_nnet = nn.NeuralNetwork(x_energy_np.shape[1], best_hidden[0], t_energy_np.shape[1])
better_nnet.train(x_train_np, t_train_np, 200)
better_test, z_test = better_nnet.use(x_test_np, allOutputs=True)
better_train, z_train = better_nnet.use(x_train_np, allOutputs=True)

For the testing data, plot the predicted and actual `Appliances` energy use, and the predicted and actual `lights` energy use, in two separate plots. 

In [None]:
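plt.figure(figsize=(10, 8))
# One subplot per target column: 0 is Appliances, 1 is lights.
for col, name in enumerate(['Appliances', 'lights']):
    plt.subplot(2, 1, col + 1)
    plt.plot(t_test_np[:, col], 'o', label='Test Target')
    plt.plot(better_test[:, col], 'o', label='Test NN Output')
    plt.ylabel(name)
    plt.legend()
plt.xlabel('Test sample index')
plt.show()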

## Data for Classification Experiment

The following section uses data from the UCI Machine Learning Repository.

Download [Anuran Calls (MFCCs)](http://archive.ics.uci.edu/ml/datasets/Anuran+Calls+%28MFCCs%29)

You can do this by visiting the Data Folder for this data set, or just do this:

     !wget 'http://archive.ics.uci.edu/ml/machine-learning-databases/00406/Anuran Calls (MFCCs).zip'
     !unzip Anuran*zip

The following functions read the data into a dataframe and remove unnecessary columns. In addition, each unique species is converted to an integer.

In [None]:
def frogs_df_read_csv(address):
    frogs = pd.read_csv(address)
    frogs.drop(['MFCCs_ 1','Family', 'Genus', 'RecordID'], axis=1, inplace=True)
    frogs['Species'] = pd.factorize(frogs.Species)[0]
    return frogs


def frogs_df_reshape(frogs):
    Tanuran = frogs.iloc[:, -1:]
    Xanuran = frogs.iloc[:, :-1]
    print(Xanuran.shape, Tanuran.shape)
    return Xanuran, Tanuran
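
The integer coding applied to `Species` can be pictured with a small pure-Python sketch. It mimics the default behaviour of `pd.factorize`, which numbers labels in order of first appearance rather than alphabetically. The helper `factorize_labels` is hypothetical, and the label strings are placeholders rather than actual rows from the data set.

```python
def factorize_labels(labels):
    """Hypothetical helper: map each distinct label to an integer code,
    numbering labels in the order they first appear."""
    codes, uniques = [], []
    index_of = {}
    for label in labels:
        if label not in index_of:
            index_of[label] = len(uniques)   # next unused integer code
            uniques.append(label)
        codes.append(index_of[label])
    return codes, uniques

# Placeholder labels for illustration only.
codes, uniques = factorize_labels(['frog_b', 'frog_a', 'frog_b', 'frog_c'])
print(codes)     # [0, 1, 0, 2]
print(uniques)   # ['frog_b', 'frog_a', 'frog_c']
```

Because the codes depend on row order, the integer assigned to a given species is only meaningful together with the list of unique labels.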

In [None]:
frogs_df = frogs_df_read_csv('Frogs_MFCCs.csv')
x_frogs_df, t_frogs_df = frogs_df_reshape(frogs_df)

In [None]:
x_frogs_np = x_frogs_df.values
t_frogs_np = t_frogs_df.values
x_frogs_np[:2,:]

In [None]:
t_frogs_np[:2]

In [None]:
for i in range(10):
    print('{} samples in class {}'.format(np.sum(t_frogs_np==i), i))

In [None]:
results = trainNNs(x_frogs_np, t_frogs_np, 0.8, [0, 5, [5, 5]], 5, 200, classify=True)

In [None]:
summarize(results)

In [None]:
bestNetwork(summarize(results))

The investigation below mirrors the one for the regression data, with the fraction of misclassified samples as the performance measure.

In [None]:
hidden_layers = [[0], [2], [5], [10], [2]*2, [5]*2, [10]*2, [2]*5, [5]*5, [10]*5, [2]*10, [5]*10]
results = trainNNs(x_frogs_np, t_frogs_np, 0.8, hidden_layers, 5, 200, classify=True)

In [None]:
summarize(results)

In [None]:
bestNetwork(summarize(results))

In [None]:
best_hidden = bestNetwork(summarize(results))
best_hidden

In [None]:
# Same partition call that trainNNs makes when classify=True.
x_train_np, t_train_np, x_test_np, t_test_np = ml.partition(x_frogs_np, t_frogs_np, (0.8, 0.2), True)
# Use the frog inputs and the best structure found above, not leftover variables.
better_nnet = nn.NeuralNetworkClassifier(x_frogs_np.shape[1], best_hidden[0], len(np.unique(t_frogs_np)))
better_nnet.train(x_train_np, t_train_np, 200)
better_train, z_train, _ = better_nnet.use(x_train_np, allOutputs=True)
better_test, z_test, _ = better_nnet.use(x_test_np, allOutputs=True)

Plot the predicted and actual Species for the testing data as an integer. 

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(t_test_np, 'o', label='Test Target')
plt.plot(better_test, 'x', label='Test NN Output')
plt.xlabel('Test sample index')
plt.ylabel('Species (integer class)')
plt.legend()
plt.show()