## Boston House Prices
In this tutorial we will create a neural network that predicts the median house prices of U.S. census tracts in Boston.

This type of supervised learning is called regression, because our desired output consists of one or more continuous variables.

### Import dependencies
Start by importing the dependencies we will need for the project.

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from uoa_mlaas import use_cpu
use_cpu()

### Set seed
Set a seed value so that repeated runs of the code draw the same random numbers. Using the same seed is important when you want to compare algorithms, because differences in the results then come from the algorithms rather than from random variation.

In [None]:
seed = 7
np.random.seed(seed)
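
As a minimal sketch of what seeding does, the example below re-seeds NumPy and draws a few random numbers. The helper `draw` is hypothetical and only for illustration. Note that `np.random.seed` seeds NumPy's global generator only; depending on the Keras backend, weight initialisation may use a separate generator, so fully identical training runs may need additional seeding.

```python
import numpy as np

def draw(seed, n=3):
    # Re-seed NumPy's global generator, then draw n uniform random numbers.
    np.random.seed(seed)
    return np.random.rand(n)

first = draw(7)
second = draw(7)   # same seed, so the same numbers are drawn
other = draw(8)    # different seed, so different numbers are drawn

print(np.array_equal(first, second))
print(np.array_equal(first, other))
```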

### Import data
The Boston House Prices dataset contains 13 features and our target output, the median value of owner-occupied homes, for 506 U.S. census tracts in Boston. Each sample corresponds to a different census tract. The features and the target are described below.

* CRIM: per capita crime rate by town
* ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
* INDUS: proportion of non-retail business acres per town
* CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise)
* NOX: nitric oxides concentration (parts per 10 million)
* RM: average number of rooms per dwelling
* AGE: proportion of owner-occupied units built prior to 1940
* DIS: weighted distances to five Boston employment centres
* RAD: index of accessibility to radial highways
* TAX: full-value property-tax rate per \$10,000
* PTRATIO: pupil-teacher ratio by town
* B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
* LSTAT: % lower status of the population
* MEDV: Median value of owner-occupied homes in \$1000's

A few rows of the dataset are shown below.

|CRIM|ZN|INDUS|CHAS|NOX|RM|AGE|DIS|RAD|TAX|PTRATIO|B|LSTAT|MEDV|
|-------|--|----|-|-----|-----|----|----|-|---|----|-----|----|--|
|0.00632|18|2.31|0|0.538|6.575|65.2|4.09|1|296|15.3|396.9|4.98|24|
|0.02731|0|7.07|0|0.469|6.421|78.9|4.9671|2|242|17.8|396.9|9.14|21.6|
|0.02729|0|7.07|0|0.469|7.185|61.1|4.9671|2|242|17.8|392.83|4.03|34.7|
|0.03237|0|2.18|0|0.458|6.998|45.8|6.0622|3|222|18.7|394.63|2.94|33.4|


To load this data into memory, use the `np.loadtxt` function.

In [None]:
data = np.loadtxt('data/housing.csv', delimiter=',')
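
To see what `np.loadtxt` returns without needing the file, the sketch below parses two rows copied from the table above. It assumes that `data/housing.csv` is plain comma-separated numbers with no header row, as the call above implies; `csv_text` and `sample` are illustrative names.

```python
import io
import numpy as np

# Two rows copied from the snapshot table, written as comma-separated text.
csv_text = (
    "0.00632,18,2.31,0,0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98,24\n"
    "0.02731,0,7.07,0,0.469,6.421,78.9,4.9671,2,242,17.8,396.9,9.14,21.6\n"
)

# np.loadtxt accepts a file-like object as well as a file name.
sample = np.loadtxt(io.StringIO(csv_text), delimiter=',')

print(sample.shape)   # (2, 14): 2 rows, 13 features plus MEDV
print(sample.dtype)   # float64
```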

Separate the data into input (X) and output (y) variables.

In [None]:
X = data[:, 0:13]
y = data[:, 13]
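
The slicing above can be checked on a small stand-in array. In this sketch `demo` is a hypothetical 5-row array with the same 14-column layout as the dataset; it is not the real data.

```python
import numpy as np

# Hypothetical stand-in for `data`: 5 rows, 14 columns (13 features + MEDV).
demo = np.arange(5 * 14, dtype=float).reshape(5, 14)

X_demo = demo[:, 0:13]   # columns 0 to 12: the 13 features
y_demo = demo[:, 13]     # column 13: the target

print(X_demo.shape)  # (5, 13)
print(y_demo.shape)  # (5,)
```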

As in the previous two tutorials, use the `train_test_split` function from scikit-learn to split the input and target data into training and test datasets. Here 33% of the samples are held out for testing.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=seed)
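
The sketch below runs the same split on placeholder arrays of the dataset's size (506 samples, 13 features) to show how many samples end up in each part and that a fixed `random_state` reproduces the split. The arrays are made up; only the shapes matter.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the dataset's dimensions (values are not real data).
X_demo = np.zeros((506, 13))
y_demo = np.arange(506)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=7)

print(X_tr.shape, X_te.shape)  # (339, 13) (167, 13)

# Splitting again with the same random_state selects the same test samples.
_, _, _, y_te_again = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=7)
print(np.array_equal(y_te, y_te_again))
```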

### Create the model
The code snippet below creates a very basic neural network model, with three layers: an input layer, a hidden layer and an output layer.

The first `Dense` call defines both the input and the hidden layer: `input_dim=13` declares 13 inputs (one for each feature), and the layer itself is a fully connected hidden layer with 13 neurons and a ReLU activation.

The last layer has 1 neuron, because we are estimating just one output, the median house price. If no activation function is specified, a linear activation function is used.

In [None]:
model = Sequential()
model.add(Dense(13, input_dim=13, activation='relu', kernel_initializer='normal'))
model.add(Dense(1, kernel_initializer='normal'))
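
To make the architecture concrete, here is a NumPy-only sketch of the computation such a network performs, along with its parameter count. The weights are small random values chosen for illustration; they are not the values Keras would initialise or learn, and `forward` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative weights with the same shapes as the two Dense layers.
W1 = rng.normal(scale=0.05, size=(13, 13))  # 13 inputs -> 13 hidden neurons
b1 = np.zeros(13)
W2 = rng.normal(scale=0.05, size=(13, 1))   # 13 hidden neurons -> 1 output
b2 = np.zeros(1)

def forward(X):
    hidden = np.maximum(0.0, X @ W1 + b1)   # ReLU activation
    return hidden @ W2 + b2                 # linear (identity) output

batch = rng.random((4, 13))                 # 4 made-up samples
print(forward(batch).shape)                 # (4, 1)

# Weights plus biases of both layers: 13*13 + 13 + 13*1 + 1
n_params = W1.size + b1.size + W2.size + b2.size
print(n_params)                             # 196
```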

### Compile the model
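We then compile the model, using the `mean_squared_error` loss function and the `adam` optimizer. The mean squared error, the average of the squared differences between the predicted and the actual values, is often used with regression problems.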

In [None]:
model.compile(loss='mean_squared_error', optimizer='adam')
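
As a minimal sketch of what this loss measures, the function below computes the mean squared error with NumPy. The true values are three MEDV entries from the table above; the predictions are made up for illustration.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    # Average of the squared differences between targets and predictions.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

y_true = [24.0, 21.6, 34.7]   # MEDV values from the snapshot table
y_pred = [22.0, 23.6, 30.7]   # hypothetical predictions

# Errors are 2, -2 and 4, so the squared errors are 4, 4 and 16.
mse = mean_squared_error(y_true, y_pred)
print(round(mse, 2))  # 8.0
```

Because the errors are squared, the one prediction that is off by 4 contributes as much to the loss as four predictions that are off by 2.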

### Fit the model
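Now that we have compiled the model, we can train it with the data we prepared earlier. We train for 100 epochs with a batch size of 5, and pass the test data as `validation_data` so that the loss on unseen data is reported after every epoch.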

In [None]:
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=100, batch_size=5)
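
The arithmetic below sketches how many weight updates these settings imply. It assumes 339 training samples, which is what a 33% test split of 506 samples leaves, and that the final, smaller batch of each epoch is still used.

```python
import math

n_train = 339      # assumed: 506 samples minus the 33% test split
batch_size = 5
epochs = 100

# One weight update per batch; the last batch holds the remainder.
steps_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)   # 68
print(last_batch_size)   # 4
print(total_updates)     # 6800
```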

### Evaluate the model
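Now that we have trained our model, we can evaluate its performance on the test data. Because the model was compiled with a loss function and no additional metrics, `evaluate` returns a single number: the mean squared error.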

In [None]:
scores = model.evaluate(X_test, y_test)
print("\n\nMean squared error: {0:.2f}".format(scores))
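
The mean squared error is expressed in squared units, which makes it hard to interpret. Taking its square root gives an error in the units of MEDV, that is, thousands of dollars. The sketch below uses a hypothetical `mse` value, not a result from this tutorial.

```python
import math

mse = 25.0                 # hypothetical value, not an actual result
rmse = math.sqrt(mse)      # root mean squared error, in units of $1000

print("RMSE: ${0:,.0f}".format(rmse * 1000))  # RMSE: $5,000
```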