# Regression Neural Network 

This note implements neural networks with Keras to study a regression problem. The dataset describes houses in Boston suburbs with 13 numerical attributes, and the task is to model the median house price in those suburbs, in thousands of dollars. The note follows the Keras [tutorial blog](http://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/).

In [167]:
import keras
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

In [185]:
print (keras.__version__)

1.2.0


## 0. Data Source

[Boston house price dataset](https://archive.ics.uci.edu/ml/datasets/Housing). The metadata is listed below. The last attribute, **MEDV**, the median house price, is our label (i.e. Y).

### Metadata
1. CRIM: per capita crime rate by town 
2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft. 
3. INDUS: proportion of non-retail business acres per town 
4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise) 
5. NOX: nitric oxides concentration (parts per 10 million) 
6. RM: average number of rooms per dwelling 
7. AGE: proportion of owner-occupied units built prior to 1940 
8. DIS: weighted distances to five Boston employment centres 
9. RAD: index of accessibility to radial highways 
10. TAX: full-value property-tax rate per 10,000 dollars
11. PTRATIO: pupil-teacher ratio by town 
12. B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town 
13. LSTAT: % lower status of the population 
14. MEDV: Median value of owner-occupied homes in $1000's

### Fetch data from csv file

**NOTE: the csv file has no header row. Therefore we need to pass `header=None` when reading it; otherwise pandas treats the first observation as column names and the data will miss that observation.**
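
The effect of `header=None` can be checked on a tiny in-memory sample. This is a minimal sketch, not the real file: the three rows below are a made-up, shortened stand-in for `housing.csv`, and `sep=r"\s+"` is used here as an equivalent way to split on whitespace.

```python
import io

import pandas as pd

# Made-up, shortened stand-in for the whitespace-delimited file (no header row)
raw = (
    "0.00632 18.0 2.31 24.0\n"
    "0.02731 0.0 7.07 21.6\n"
    "0.02729 0.0 7.07 34.7\n"
)

# header=None: every line is treated as data
with_header_none = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None)

# default header handling: the first line is consumed as column names
default_header = pd.read_csv(io.StringIO(raw), sep=r"\s+")

print(with_header_none.shape, default_header.shape)  # (3, 4) (2, 4)
```

With the default setting one observation is silently lost, which is why the row count should be checked after loading.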

In [77]:
data = pd.read_csv("~/Desktop/housing.csv", delim_whitespace=True, header=None)

In [78]:
data.columns = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 
                'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']

In [6]:
data.shape

(506, 14)

In [32]:
data.head()

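| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | MEDV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0 | 0.538 | 6.575 | 65.2 | 4.09 | 1 | 296.0 | 15.3 | 396.9 | 4.98 | 24.0 |
| 1 | 0.02731 | 0.0 | 7.07 | 0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2 | 242.0 | 17.8 | 396.9 | 9.14 | 21.6 |
| 2 | 0.02729 | 0.0 | 7.07 | 0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2 | 242.0 | 17.8 | 392.83 | 4.03 | 34.7 |
| 3 | 0.03237 | 0.0 | 2.18 | 0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3 | 222.0 | 18.7 | 394.63 | 2.94 | 33.4 |
| 4 | 0.06905 | 0.0 | 2.18 | 0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3 | 222.0 | 18.7 | 396.9 | 5.33 | 36.2 |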


In [222]:
x_features = list(data.columns)
x_features.remove('MEDV')
X = data.loc[:, x_features]
Y = data['MEDV']

In [223]:
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

In [224]:
X_train.shape, X_test.shape

((404, 13), (102, 13))

### NOTE: Keras expects numpy.array as input

Therefore, after reading the data into a pandas DataFrame, we need to convert it to NumPy arrays. See this [Stack Overflow answer](https://stackoverflow.com/questions/35968973/keras-indexerror-indices-are-out-of-bounds/36798335#36798335).

In [178]:
X_train, Y_train = X_train.values, Y_train.values
X_test, Y_test = X_test.values, Y_test.values
print (X_train.shape, X_test.shape)

(404, 13) (102, 13)


## 1. Baseline Model

### input(13) - hidden(10) - output(1)

The baseline model is a simple input-hidden-output network. Since there are 13 attributes, the input layer has 13 units. It is followed by a hidden layer with 10 neurons using the **relu** activation function. Since this is a regression problem, the outcome is continuous, so the output layer needs only one neuron with no activation function (i.e. a linear output).

We train the regression network with the **ADAM** optimization algorithm and use **mean squared error** as the loss function.

Note that the tutorial uses `kernel_initializer='normal'` to initialize the network weights. Keras v1.x, however, does not have this argument; the weights should instead be initialized with `init='normal'`. See the [stackoverflow discussion](https://stackoverflow.com/questions/43556979/typeerror-keyword-argument-not-understood-padding/43557277).

In [209]:
def baseline_model():
    model = Sequential()
    model.add(Dense(10, input_dim=13, init='normal', activation='relu'))
    model.add(Dense(1, init='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
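
To make the architecture concrete, the forward pass of the input(13) - hidden(10) - output(1) network and its loss can be written out in plain NumPy. This is only an illustrative sketch, not how Keras implements it: the weights, the 0.05 scale, and the demo data are all made up, and no training takes place.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights: small random normal values, biases at zero
W1 = rng.normal(scale=0.05, size=(13, 10))  # input -> hidden
b1 = np.zeros(10)
W2 = rng.normal(scale=0.05, size=(10, 1))   # hidden -> output
b2 = np.zeros(1)


def forward(X):
    """Hidden layer with relu, then a single linear output unit."""
    hidden = np.maximum(0.0, X @ W1 + b1)
    return (hidden @ W2 + b2).ravel()


def mse(y_true, y_pred):
    """Mean squared error, the loss used in this note."""
    return np.mean((y_true - y_pred) ** 2)


# Made-up batch of 5 observations with 13 attributes each
X_demo = rng.normal(size=(5, 13))
y_demo = rng.normal(loc=22.0, scale=9.0, size=5)

y_hat = forward(X_demo)
n_params = W1.size + b1.size + W2.size + b2.size
print(y_hat.shape, n_params)  # (5,) 151
print(mse(y_demo, y_hat))
```

The network has 13 x 10 + 10 = 140 parameters in the hidden layer and 10 + 1 = 11 in the output layer, 151 in total.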

In [210]:
estimator = KerasRegressor(build_fn=baseline_model, nb_epoch=100, batch_size=5, verbose=0)

In [211]:
seed = 7
np.random.seed(seed)
kfold = KFold(n_splits=10, random_state=seed)
results = cross_val_score(estimator, X_train, Y_train, cv=kfold)
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Results: 25.42 (7.52) MSE


With 10-fold cross validation, the average MSE (mean squared error) is 25.42 +/- 7.52.
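
The reported mean and standard deviation come from ten per-fold scores. The sketch below mimics that procedure by hand under loud assumptions: the targets are synthetic, the "model" simply predicts the training mean, and the folds are contiguous and unshuffled.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in for Y_train (404 values); not the real prices
y = rng.normal(loc=22.0, scale=9.0, size=404)

# Ten contiguous, unshuffled folds over the sample indices
folds = np.array_split(np.arange(len(y)), 10)

scores = []
for test_idx in folds:
    train_mask = np.ones(len(y), dtype=bool)
    train_mask[test_idx] = False
    # Stand-in "model": always predict the mean of the training targets
    prediction = y[train_mask].mean()
    scores.append(np.mean((y[test_idx] - prediction) ** 2))

scores = np.array(scores)
print([len(f) for f in folds])  # [41, 41, 41, 41, 40, 40, 40, 40, 40, 40]
print("Results: %.2f (%.2f) MSE" % (scores.mean(), scores.std()))
```

Each fold holds only about 40 observations, which is one plausible reason the per-fold MSE varies as much as the reported standard deviation suggests.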

## 2. Regression with Standardized Data

In [201]:
# Fit the scaler on the training data only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [202]:
X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0)

(array([ -3.73738444e-17,  -3.51753830e-17,  -1.89067683e-16,
         -4.61676901e-17,   4.06715365e-16,  -3.06685370e-16,
          2.04456913e-16,  -1.31907686e-17,   6.15569202e-17,
         -1.71479992e-16,   1.08384149e-15,   3.05586139e-16,
          3.60547675e-16]),
 array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]))
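
The output above (means around 1e-16, standard deviations of 1) is what standardization is expected to produce: each column is shifted by its training mean and divided by its training standard deviation. A minimal NumPy sketch of the same transformation, using made-up two-column data rather than the housing features:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up features on very different scales (two columns only)
train = rng.normal(loc=[0.5, 300.0], scale=[0.1, 150.0], size=(404, 2))
test = rng.normal(loc=[0.5, 300.0], scale=[0.1, 150.0], size=(102, 2))

# Statistics are computed on the training split only
mu = train.mean(axis=0)
sigma = train.std(axis=0)  # population standard deviation (ddof=0)

train_scaled = (train - mu) / sigma
test_scaled = (test - mu) / sigma  # reuse the training statistics

print(np.allclose(train_scaled.mean(axis=0), 0.0))  # True
print(np.allclose(train_scaled.std(axis=0), 1.0))   # True
```

The test split is scaled with the training mean and standard deviation, so its columns are close to, but not exactly, zero mean and unit variance.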

In [212]:
kfold = KFold(n_splits=10, random_state=seed)
results = cross_val_score(estimator, X_train_scaled, Y_train, cv=kfold)
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Results: 16.87 (8.09) MSE


After standardization, the average MSE decreases from 25.42 to 16.87 +/- 8.09, so the model performs better on average, although the spread across folds remains large.
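
Note that the scaler above is fit on all of `X_train` before cross-validation, so each held-out fold has already influenced the scaling statistics. One common way to avoid this is to put the scaler and the model into a scikit-learn `Pipeline`, so the scaler is refit inside every fold. The sketch below only shows the pattern: it uses synthetic data and a `Ridge` regressor as a stand-in for the Keras estimator, both of which are assumptions and not part of this note.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Synthetic regression data with the same shape as X_train (404 x 13)
X_demo = rng.normal(loc=50.0, scale=20.0, size=(404, 13))
y_demo = X_demo @ rng.normal(size=13) + rng.normal(size=404)

# The scaler is fit on the training part of each fold only
pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("regressor", Ridge(alpha=1.0)),  # stand-in for the neural network
])

kfold = KFold(n_splits=10)
scores = cross_val_score(pipeline, X_demo, y_demo, cv=kfold,
                         scoring="neg_mean_squared_error")
mse_per_fold = -scores  # scikit-learn reports negated MSE
print("Results: %.2f (%.2f) MSE" % (mse_per_fold.mean(), mse_per_fold.std()))
```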

## 3. Other Neural Networks

In the following, we perform regression with networks that have more layers or more neurons.

### a. input(13) - hidden(10) - hidden(10) - output(1)

In [204]:
def model_1():
    model = Sequential()
    model.add(Dense(10, input_dim=13, init='normal', activation='relu'))
    model.add(Dense(10, init='normal', activation='relu'))
    model.add(Dense(1, init='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

In [205]:
estimator = KerasRegressor(build_fn=model_1, nb_epoch=100, batch_size=5, verbose=0)
kfold = KFold(n_splits=10, random_state=seed)
results = cross_val_score(estimator, X_train_scaled, Y_train, cv=kfold)
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Results: 14.91 (7.97) MSE


The average MSE is 14.91 +/- 7.97.

### b. input(13) - hidden(20) - output(1)

In [206]:
def model_2():
    model = Sequential()
    model.add(Dense(20, input_dim=13, init='normal', activation='relu'))
    model.add(Dense(1, init='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

In [208]:
estimator = KerasRegressor(build_fn=model_2, nb_epoch=100, batch_size=5, verbose=0)
kfold = KFold(n_splits=10, random_state=seed)
results = cross_val_score(estimator, X_train_scaled, Y_train, cv=kfold)
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Results: 14.50 (7.62) MSE


The average MSE is 14.50 +/- 7.62.
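
To compare the sizes of the three architectures, the number of trainable parameters can be counted directly: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The helper `dense_params` below is a hypothetical name introduced for this sketch.

```python
def dense_params(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


architectures = {
    "baseline": [13, 10, 1],
    "model_1": [13, 10, 10, 1],
    "model_2": [13, 20, 1],
}

counts = {name: dense_params(sizes) for name, sizes in architectures.items()}
print(counts)  # {'baseline': 151, 'model_1': 261, 'model_2': 301}
```

All three networks are small, and the deeper and the wider variants have a similar number of parameters.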

## 4. Model Prediction

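From the above comparison, the wider network (`model_2`, one hidden layer with 20 neurons) has the lowest average cross-validation MSE (14.50), although the differences between the models are small compared with the standard deviation across folds. We therefore use it to predict on the test set: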

In [218]:
best_model = KerasRegressor(build_fn=model_2, nb_epoch=100, batch_size=5, verbose=0)
best_model.fit(X_train_scaled, Y_train)
predictedY = best_model.predict(X_test_scaled)
print (best_model.score(X_test_scaled, Y_test))
print (predictedY[:5])

12.9430561112
[ 28.6109066   27.84487534  21.2660141   31.34295845  45.72811127]
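
Reading the printed score as the test MSE, as this note does, its units are squared thousands of dollars. Taking the square root gives an error in the same units as MEDV. The sketch below defines the MSE explicitly and converts the reported value; the three target and prediction values are made up for illustration and are not taken from the test set.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)


# Made-up values: squared errors are 1, 1 and 4, so the mean is 2
y_true = [24.0, 22.0, 35.0]
y_pred = [25.0, 21.0, 33.0]
print(mse(y_true, y_pred))  # 2.0

# Test-set score reported above
reported_mse = 12.9430561112
rmse = np.sqrt(reported_mse)
print("%.2f" % rmse)  # 3.60
```

A root mean squared error of about 3.60 corresponds to a typical prediction error of roughly 3,600 dollars on the median house value.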


In [220]:
print (len(predictedY), X_test_scaled.shape)

102 (102, 13)


## Reference

1. [Regression Tutorial with the Keras Deep Learning Library in Python](http://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/)