<a href="https://colab.research.google.com/github/marcelounb/Deep_Learning_with_python_JasonBrownlee/blob/master/12_1_Boston_House_Price_Dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project: Regression Of Boston House Prices

This tutorial works with the Boston house price dataset. The dataset describes properties of houses in Boston suburbs and is used to model the price of houses in those suburbs in thousands of dollars.

As such, this is a **regression predictive modeling problem.** There are 13 input variables that describe the properties of a given Boston suburb, plus one output variable (MEDV). The full list of attributes in this dataset is as follows:
1. CRIM: per capita crime rate by town.
2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
3. INDUS: proportion of non-retail business acres per town.
4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise).
5. NOX: nitric oxides concentration (parts per 10 million).
6. RM: average number of rooms per dwelling.
7. AGE: proportion of owner-occupied units built prior to 1940.
8. DIS: weighted distances to five Boston employment centers.
9. RAD: index of accessibility to radial highways.
10. TAX: full-value property-tax rate per 10,000 dollars.
11. PTRATIO: pupil-teacher ratio by town. 
12. B: $1000(B_k - 0.63)^2$, where $B_k$ is the proportion of Black residents by town.
13. LSTAT: % lower status of the population.
14. MEDV: median value of owner-occupied homes in thousands of dollars (the output variable).


This is a well-studied problem in machine learning. It is convenient to work with because all of the input and output attributes are numerical and there are 506 instances to work with.

The dataset is available in the bundle of source code provided with this book. Alternatively, you can download this dataset and save it to your current working directory with the file name housing.csv.

**Reasonable performance for models evaluated** using Mean Squared Error (MSE) is around 20 in squared thousands of dollars (or about 4,500 dollars if you take the square root). This is a nice target to aim for with our neural network model.
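The conversion from an MSE of 20 to a figure in dollars can be checked with a couple of lines. This is only a small arithmetic sketch; the variable names are chosen for illustration.

```python
import math

# MSE target quoted above, in squared thousands of dollars
mse = 20.0

# Taking the square root returns to the original units (thousands of dollars)
rmse = math.sqrt(mse)

print(f"RMSE: {rmse:.2f} thousand dollars (about ${rmse * 1000:,.0f})")
```

The square root is roughly 4.47, which is where the rounded figure of about 4,500 dollars comes from.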


In [3]:
import numpy as np 
import pandas as pd 
from keras.models import Sequential 
from keras.layers import Dense 
from keras.wrappers.scikit_learn import KerasRegressor 
from sklearn.model_selection import cross_val_score 
from sklearn.model_selection import KFold 
from sklearn.preprocessing import StandardScaler 
from sklearn.pipeline import Pipeline

Using TensorFlow backend.


In [0]:
# load dataset 
dataframe = pd.read_csv("/content/housing.csv", delim_whitespace=True, header=None) 
dataset = dataframe.values 
# split into input (X) and output (Y) variables 
X = dataset[:,0:13] 
Y = dataset[:,13]

In [6]:
X[0]

array([6.320e-03, 1.800e+01, 2.310e+00, 0.000e+00, 5.380e-01, 6.575e+00,
       6.520e+01, 4.090e+00, 1.000e+00, 2.960e+02, 1.530e+01, 3.969e+02,
       4.980e+00])

In [7]:
print(X.shape, Y.shape)

(506, 13) (506,)
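Because the file is loaded without a header, the columns of `X` carry no names. As a minimal sketch, the first row printed above can be paired with the attribute names from the list at the top of this tutorial. The names `feature_names`, `first_row` and `labelled` are introduced here for illustration only, and the values are copied from the `X[0]` output rather than read from the file.

```python
import pandas as pd

# Input attribute names, in the order given in the attribute list
feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# First row of X, copied from the X[0] output shown above
first_row = [6.320e-03, 1.800e+01, 2.310e+00, 0.000e+00, 5.380e-01, 6.575e+00,
             6.520e+01, 4.090e+00, 1.000e+00, 2.960e+02, 1.530e+01, 3.969e+02,
             4.980e+00]

# Pair each value with its attribute name
labelled = pd.Series(first_row, index=feature_names)
print(labelled)
```

Printing the labelled row also makes it obvious that the attributes sit on very different scales (compare NOX with TAX).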


# Develop a Baseline Neural Network Model

We can create Keras models and evaluate them with scikit-learn by using handy wrapper objects provided by the Keras library. This is desirable, because scikit-learn excels at evaluating models and will allow us to use powerful data preparation and model evaluation schemes with very few lines of code. 

The Keras wrapper class requires a **function** as an argument. This function, which we must define, is responsible for creating the neural network model to be evaluated.

Below we **define the function to create the baseline model to be evaluated.** It is a simple model that has a single fully connected hidden layer with the same number of neurons as input attributes (13).

The network uses good practices such as the **rectifier** activation function for the hidden layer. No activation function is used for the output layer because it is a regression problem and we are interested in predicting numerical values directly, without any transformation.
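To make the architecture concrete, here is a rough NumPy sketch of the computation such a network performs: 13 inputs, a hidden layer of 13 rectifier units, and a single linear output. It is not how Keras implements the layers; the weights are random placeholders rather than trained values, and the names `relu`, `forward` and `X_demo` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n_inputs, n_hidden = 13, 13

# Random placeholder weights stand in for trained ones (illustration only)
W1 = rng.normal(scale=0.05, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.05, size=(n_hidden, 1))
b2 = np.zeros(1)

def relu(z):
    """Rectifier activation: negative values become zero."""
    return np.maximum(z, 0.0)

def forward(X):
    """Hidden layer with rectifier, then a linear output with no activation."""
    hidden = relu(X @ W1 + b1)
    return (hidden @ W2 + b2).ravel()

# Four made-up input rows, just to exercise the shapes
X_demo = rng.normal(size=(4, n_inputs))
preds = forward(X_demo)

# 13*13 weights + 13 biases in the hidden layer, 13 weights + 1 bias in the output
n_params = W1.size + b1.size + W2.size + b2.size
print(preds.shape, n_params)
```

Because the output layer is linear, the predictions are unbounded real numbers, which is what a regression target such as MEDV needs.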

The efficient Adam optimization algorithm is used and a mean squared error loss function is optimized. This is the same metric that we will use to evaluate the performance of the model. It is a desirable metric because taking the square root of the error gives a value that we can directly understand in the context of the problem, in units of thousands of dollars.
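As a small illustration of the metric itself, the sketch below computes the mean squared error and its square root by hand. The target and prediction values are made up for the example and do not come from the dataset.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up prices and predictions, in thousands of dollars
y_true = [24.0, 30.0, 18.0]
y_pred = [22.0, 33.0, 19.0]

mse = mean_squared_error(y_true, y_pred)   # (4 + 9 + 1) / 3
rmse = mse ** 0.5                          # back in thousands of dollars
print(f"MSE: {mse:.2f}, RMSE: {rmse:.2f}")
```

Squaring the errors penalizes large misses more heavily than small ones, and the square root brings the figure back to the units of the target.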


In [0]:
# define base model
def baseline_model():
    # create model: 13 inputs -> 13 hidden units (relu) -> 1 linear output
    model = Sequential()
    model.add(Dense(13, input_dim=13, kernel_initializer='normal', activation='relu'))
    # use kernel_initializer here as well; init is the old Keras 1 argument name
    model.add(Dense(1, kernel_initializer='normal'))
    # compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

The Keras wrapper object for use in scikit-learn as a regression estimator is called KerasRegressor. We create an instance and pass it both the name of the function to create the neural network model as well as some parameters to pass along to the fit() function of the model later, such as the number of epochs and batch size. Both of these are set to sensible defaults. 

We also initialize the random number generator with a constant random seed, a process we will repeat for each model evaluated in this tutorial. This is to ensure we compare models consistently and that the results are reproducible.
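The effect of fixing the seed can be checked in isolation. The sketch below only covers NumPy's global random number generator; depending on the Keras backend and version, other sources of randomness may need their own seeds, so treat this as an illustration of the idea rather than a guarantee of identical training runs.

```python
import numpy as np

seed = 7

# Seed, then draw three numbers
np.random.seed(seed)
first = np.random.rand(3)

# Re-seed with the same value and draw again
np.random.seed(seed)
second = np.random.rand(3)

# The two draws are identical because the generator restarted from the same state
print(np.array_equal(first, second))
```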


In [0]:
# fix random seed for reproducibility 
seed = 7 
np.random.seed(seed)
# wrap the baseline model so that scikit-learn can evaluate it
estimator = KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=5)  # add verbose=0 to silence the per-epoch output

The final step is to evaluate this baseline model. We will use 10-fold cross-validation to evaluate the model.
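To see what 10-fold splitting means for a dataset of this size, the following sketch counts how many of the 506 rows land in each held-out fold. It uses a placeholder array of zeros instead of the real data, and the names `placeholder` and `test_sizes` are chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

n_samples = 506

# Placeholder data: only the number of rows matters for the split
placeholder = np.zeros((n_samples, 1))

kfold = KFold(n_splits=10)
test_sizes = [len(test_idx) for _, test_idx in kfold.split(placeholder)]

print(test_sizes)        # size of each held-out fold
print(sum(test_sizes))   # every row is held out exactly once
```

Each model is trained on the remaining rows and scored on its held-out fold, so the evaluation trains ten separate models.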

In [12]:
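# random_state is omitted: it has no effect unless shuffle=True, and recent scikit-learn versions raise an error for that combination
kfold = KFold(n_splits=10)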
results = cross_val_score(estimator, X, Y, cv=kfold) 
print("Baseline: %.2f (%.2f) MSE" % (results.mean(), results.std()))

  


Epoch 1/100
Epoch 2/100
Epoch 3/100
...

Running this code gives us an estimate of the model's performance on the problem for unseen data. The result reports the mean and standard deviation of the mean squared error across all 10 folds of the cross-validation evaluation.
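The summary line is just the mean and standard deviation of the per-fold scores. As a sketch, the values below are hypothetical fold scores invented for illustration; they are not results from this notebook.

```python
import numpy as np

# Hypothetical per-fold MSE values (invented, not actual results)
fold_mse = np.array([18.0, 25.0, 22.0, 30.0, 15.0])

mean_mse = fold_mse.mean()
std_mse = fold_mse.std()   # NumPy's default is the population standard deviation (ddof=0)

print("Example: %.2f (%.2f) MSE" % (mean_mse, std_mse))
```

A large standard deviation relative to the mean suggests that performance depends heavily on which rows end up in the held-out fold.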