# Neural Network Regression

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import sklearn as skl
import numpy as np

import seaborn as sns
sns.set(font_scale=2)

%matplotlib inline

In [None]:
import sklearn.decomposition
import sklearn.random_projection
import sklearn.neural_network
from sklearn.model_selection import train_test_split

Neural networks are a powerful and flexible tool for solving problems
ranging from data science
to artificial intelligence.

The online book
[Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/)
is a great resource for folks
with less background in mathematics and statistics,
while the online book
[Deep Learning](http://www.deeplearningbook.org/)
is more authoritative and thorough
for anyone comfortable with those fields.

If you'd just like to see some neural networks in action
on some visualizable datasets,
check out Google's
[Neural Network Playground](http://playground.tensorflow.org/).

The idea behind neural networks is simple:
we linearly transform our inputs,
$\mathbf{x}$,
using a matrix, $W$,
and then apply a nonlinear function $f$
to get values we call the
*activations*, $\mathbf{a}$:

$$
\mathbf{a} = f(W\mathbf{x})
$$
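
As a minimal sketch of this single step, the NumPy snippet below applies a random matrix to a random input and then a nonlinearity.
The sizes (4 inputs, 3 units), the random values, and the choice of a rectified linear unit for $f$ are assumptions made for illustration; they do not come from the dataset used later.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=4)         # input vector with 4 features (illustrative size)
W = rng.normal(size=(3, 4))    # matrix mapping 4 inputs to 3 units


def f(z):
    """Rectified linear unit, one common choice of nonlinearity."""
    return np.maximum(z, 0.0)


a = f(W @ x)                   # activations: one value per unit
print(a.shape)
```

Note that the equation above, like this sketch, leaves out the additive bias term that most implementations include alongside $W$.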

These activations can then be passed through another combination
of linear and nonlinear transformations,
as can the activations that result.

At some point, we compare a final set of activations $\hat{y}$
to a target value $y$
and adjust the matrices to shrink the difference between them.
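
The stacking and the final comparison can be sketched in a few lines.
Everything here is hypothetical: the layer sizes, the random weights `W1` and `W2`, and the target `y` are made up,
and the mean squared error is just one reasonable way to compare $\hat{y}$ with $y$.
The last layer skips the nonlinearity, as is typical for regression.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


x = rng.normal(size=4)            # input vector (illustrative size)
W1 = rng.normal(size=(8, 4))      # first layer: 4 inputs to 8 hidden units
W2 = rng.normal(size=(2, 8))      # second layer: 8 hidden units to 2 outputs

a1 = relu(W1 @ x)                 # hidden activations
y_hat = W2 @ a1                   # final activations, no nonlinearity applied

y = np.array([0.5, -1.0])         # made-up target values
loss = np.mean(np.square(y - y_hat))   # mean squared error between y and y_hat
print(loss)
```

Training amounts to nudging `W1` and `W2` so that this loss goes down, averaged over many input and target pairs.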

In [None]:
train = pd.read_csv('../data/training.csv')

train.head()

In [None]:
data_columns = [column for column in train.columns if column.startswith('m')]
wavenumbers = [float(column.lstrip('m')) for column in data_columns]

output_columns = ["Ca", "P", "pH", "SOC", "Sand"]

# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it
X = train[data_columns].to_numpy()
y = train[output_columns].to_numpy()

In [None]:
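# One hidden layer of 100 units; early_stopping holds out part of the
# training data and stops once the validation score stops improving
MLP_model = sklearn.neural_network.MLPRegressor(hidden_layer_sizes=(100,),
                                                max_iter=10000, tol=1e-16,
                                                early_stopping=True,
                                                batch_size=16)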

In [None]:
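# Reduce each spectrum to its first 100 principal components
# (note: the PCA is fit on all rows, before the train/test split)
transformed_X = sklearn.decomposition.PCA(n_components=100, whiten=False).fit_transform(X)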

In [None]:
X_train, X_test, y_train, y_test = train_test_split(transformed_X, y,
                                                    test_size=0.2)

In [None]:
MLP_model.fit(X_train,y_train);

In [None]:
# score() returns the coefficient of determination, R^2 (higher is better)
print(MLP_model.score(X_train, y_train))
print(MLP_model.score(X_test, y_test))

In [None]:
def MCWRMSE(y, y_hat):
    """Mean column-wise RMSE: the RMSE of each output column, averaged over columns."""
    return np.mean(np.sqrt(np.mean(np.square(y - y_hat), axis=0)))

In [None]:
y_hat = MLP_model.predict(X_test)
MCWRMSE(y_test,y_hat)
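
To see what this metric measures, here is a small hand-checkable sketch.
The function `mcwrmse` repeats the formula above under a lowercase name so the snippet runs on its own,
and the arrays are made up: every prediction is off by 1 in the first column and by 2 in the second,
so the column-wise errors are 1 and 2 and their mean is 1.5.

```python
import numpy as np


def mcwrmse(y, y_hat):
    # RMSE of each column (axis=0 averages over samples), then the mean over columns
    return np.mean(np.sqrt(np.mean(np.square(y - y_hat), axis=0)))


y_true = np.zeros((2, 2))              # two samples, two output columns
y_pred = np.array([[1.0, 2.0],
                   [1.0, 2.0]])

result = mcwrmse(y_true, y_pred)
print(result)  # 1.5
```

Because each column contributes its error in its own units, an output with a large numeric range can dominate the average.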