# House Pricing Regression using Dense Neural Network (DNN) 

## Import Libraries

In [20]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

## Upload and Explore Dataset
[The Boston house-price data](http://lib.stat.cmu.edu/datasets/boston) 
* This is a dataset taken from the StatLib library which is maintained at Carnegie Mellon University.
* There are 506 samples, each with 13 attributes (features `Xi`, indexed 0 to 12) describing houses at different locations around the Boston suburbs in the late 1970s. The attributes are defined on the StatLib website (per capita crime rate in the area, number of rooms, distance from employment centers, etc.).
* The target (`Y`) is the median value of the houses at a location (in USD 1,000).

**Goal**
*  Our goal is to build a regression model that takes these **13 features as input** and **outputs a single predicted value**: the "median value of owner-occupied homes (in USD 1000)."
* The dataset can be downloaded directly with [tf.keras.datasets.boston_housing](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/boston_housing/load_data)


In [21]:
data = tf.keras.datasets.boston_housing

(x_train, y_train), (x_test, y_test) = data.load_data()

In [None]:
print(x_train.shape)
print(y_train.shape)

In [None]:
print(x_test.shape)
print(y_test.shape)

### Exploring Target (Y)

In [None]:
y_train

In [None]:
print('Min price in $K:  ',y_train.min())
print('Mean price in $K: ',round(y_train.mean(),2))
print('Max price in $K:  ',y_train.max())


In [None]:
plt.hist(y_train, label='train')
plt.hist(y_test, label = 'test')
plt.xlabel('Price in K$')
plt.legend();

In [None]:
y_train[0]

In [None]:
x_train[0]

### Exploring Input Features (X)

In [None]:
for i in range(len(x_train[0])):
  print("Feature {} ==> range from {} to {}".format(
      i, x_train[:,i].min(), x_train[:,i].max()
      )
  )

In [None]:
feature = 1
plt.hist(x_train[:,feature], label='train')
plt.hist(x_test[:,feature], label = 'test')
plt.legend();

In [None]:
print (x_train.max())
print (x_train.min())

### Preprocessing Data 

**Normalizing Data**: 
The value ranges vary widely from one feature to another. When training a neural network, it is easier to work with features on similar scales (for example between 0 and 1, or with zero mean and unit variance), a process called 'normalizing'. Here, every feature is `rescaled` to its standard score.

The standard score of a sample `x` is calculated as:

        z = (x - u) / s

where `u` is the mean of the training samples or zero if `with_mean=False` and `s` is the standard deviation of the training samples or one if `with_std=False`.
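
The two rescaling options mentioned above (a 0 to 1 range versus the standard score) can be compared on a toy column. This is only a minimal sketch: the values are made up for illustration, and the variable names are not part of the notebook.

```python
import numpy as np

# Toy feature column (illustrative values, not taken from the dataset)
x = np.array([2.0, 4.0, 6.0, 8.0])

# Min-max scaling: maps the column onto the interval [0, 1]
x_minmax = (x - x.min()) / (x.max() - x.min())

# Standard score: zero mean and unit standard deviation
u = x.mean()
s = x.std()            # population standard deviation (ddof=0)
z = (x - u) / s

print(x_minmax)
print(z)
print(z.mean(), z.std())
```

Min-max scaling bounds the values, while the standard score centers them around zero; either way, the features end up on comparable scales.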

In [32]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

# first we fit the scaler on the training dataset
scaler.fit(x_train)

# then we call the transform method to scale both the training and testing data
x_train_norm = scaler.transform(x_train)
x_test_norm = scaler.transform(x_test)

Another way to normalize the data is directly with NumPy:
- Get the per-feature statistics (mean, standard deviation) from the training set and normalize both sets with them:
  - `x_train_mean = np.mean(x_train, axis=0)`
  - `x_train_std = np.std(x_train, axis=0)`
  - `x_train_norm = (x_train - x_train_mean) / x_train_std`
  - `x_test_norm = (x_test - x_train_mean) / x_train_std`

**Note** that the quantities used for normalizing the test data are computed using the training data. You should never use in your workflow any quantity computed on the test data, even for something as simple as data normalization.
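
To make the note concrete, here is a small sketch on synthetic data. The arrays `toy_train` and `toy_test` are hypothetical stand-ins for `x_train` and `x_test`; the point is only that the mean and standard deviation come from the training split and are then reused for the test split.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for x_train / x_test: 3 features on very different scales
scales = np.array([1.0, 10.0, 100.0])
toy_train = rng.normal(size=(100, 3)) * scales
toy_test = rng.normal(size=(20, 3)) * scales

# Statistics are computed on the training split only
train_mean = toy_train.mean(axis=0)
train_std = toy_train.std(axis=0)

toy_train_norm = (toy_train - train_mean) / train_std
toy_test_norm = (toy_test - train_mean) / train_std   # reuse the training statistics

print(toy_train_norm.mean(axis=0).round(3), toy_train_norm.std(axis=0).round(3))
print(toy_test_norm.mean(axis=0).round(3), toy_test_norm.std(axis=0).round(3))
```

The normalized training features have mean 0 and standard deviation 1 by construction. The test features are only approximately so, which is expected because no test statistic was used.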

In [None]:
print (x_train_norm.max())
print (x_train_norm.min())

The first training sample after normalization:

In [None]:
print(x_train_norm[0])

## Define Model

In [None]:
x_train.shape

In [None]:
x_train.shape[1]

In [None]:
input_shape = x_train.shape[1]
input_shape

The model can be created, for example, with these layers:
- [input] ==> [hidden] ==> [output]:
  - 13 ==> [20] ==>  1

The **Input Layer** should have size 13 (the number of features) and the **Output Layer** should have size 1 to match the target (y). The number of neurons in the **Hidden Layers** is arbitrary.

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(input_shape,)),  # shape must be a tuple: (13,)
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(1)
    ])

model.summary()

The input layer has 13 connections, one for each feature [X]. Each feature feeds every neuron of the first Dense layer, which has 20 neurons. So the first Dense layer has (13 x 20) weights + 20 biases ==> 280 parameters. The output layer is a single neuron with 20 inputs (the outputs of the previous layer) plus 1 bias ==> 21 parameters.
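
The same arithmetic can be checked with a few lines of plain Python. The helper `dense_params` is a hypothetical name used only for this sketch; it counts weights plus biases for one fully connected layer.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units

hidden = dense_params(13, 20)   # 13 * 20 weights + 20 biases
output = dense_params(20, 1)    # 20 * 1 weights + 1 bias
total = hidden + output

print(hidden, output, total)    # 280 21 301
```

These numbers should match the `Param #` column printed by `model.summary()`.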

For simplicity, the input layer can be merged into the first layer:

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(20, 
                          activation='relu', 
                          input_shape = [13]),
    tf.keras.layers.Dense(1)
    ])

model.summary()

## Compile Model

### Type of errors
In statistics, `Mean Absolute Error (MAE)` is a measure of errors between paired observations expressing the same phenomenon. Examples of Y versus X include comparisons of predicted versus observed, subsequent time versus initial time, and one technique of measurement versus an alternative technique of measurement. MAE is calculated as:


$$MAE=\frac{1}{n}\sum_{i=1}^{n}(\left|y_{i}-\hat{y}_{i}\right|)$$


An alternative metric for evaluating regression is the `Root Mean Square Error (RMSE)`, the square root of the mean of the squared errors. It is a popular measure of a regression model's performance because it keeps the same unit as y while penalizing larger errors more heavily than MAE does.

$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{(y_{i}-\hat{y}_{i})}^2}$$

You can use MSE as the loss while also tracking MAE or RMSE, since those values have the same unit as the target (in this case, multiples of USD 1,000).
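
As a quick worked example, the three quantities can be computed by hand with NumPy. The target and prediction values below are made up for illustration (think of them as prices in USD 1,000).

```python
import numpy as np

y_true = np.array([20.0, 25.0, 30.0, 35.0])
y_pred = np.array([22.0, 24.0, 27.0, 41.0])

errors = y_true - y_pred            # [-2, 1, 3, -6]
mae = np.mean(np.abs(errors))       # (2 + 1 + 3 + 6) / 4 = 3.0
mse = np.mean(errors ** 2)          # (4 + 1 + 9 + 36) / 4 = 12.5
rmse = np.sqrt(mse)                 # sqrt(12.5), about 3.54

print(mae, mse, round(rmse, 3))
```

Here the single large error (6) pushes the RMSE above the MAE, which illustrates why the squared-error metrics are more sensitive to outliers.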

The optimizer used is [Adam](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam), a stochastic gradient descent method based on adaptive estimation of first-order and second-order moments. The learning rate hyperparameter is left at its default value of 0.001.

In [40]:
model.compile(
    optimizer='adam',
    loss='mse',
    metrics=['mae'] # used to monitor the training and testing steps.
    )
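
To give an intuition for what "adaptive estimation of first-order and second-order moments" means, here is a minimal NumPy sketch of one Adam update, applied to a toy one-dimensional problem. It follows the update rule from the original Adam paper with the Keras default hyperparameters; the function name `adam_step` is hypothetical, and library implementations may differ in details such as where epsilon is applied.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One Adam update for parameter theta given its gradient grad."""
    m = beta_1 * m + (1 - beta_1) * grad          # first-moment (mean) estimate
    v = beta_2 * v + (1 - beta_2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta_1 ** t)                 # bias correction for early steps
    v_hat = v / (1 - beta_2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3)
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 5001):
    grad = 2 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)

print(round(theta, 3))   # should end up close to the minimum at 3
```

A larger learning rate (0.01) is used in the toy loop only so that it converges in a few thousand steps.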

## Train the model

In [41]:
history = model.fit(
    x_train_norm, 
    y_train,
    epochs=1000, 
    verbose=0
    )

Inspecting the model

In [None]:
train_eval = model.evaluate(x_train_norm, y_train)
print ("Training data MAE: {:.2}".format(train_eval[1]))  # index 1 is the 'mae' metric

In [None]:
history.history.keys()

In [None]:
plt.plot(history.history['loss'], label='MSE')
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(loc='upper right')
#plt.ylim([0,50])
plt.show()

In [None]:
plt.plot(history.history['mae'], label='MAE')
plt.title('model MAE')
plt.ylabel('MAE in $K')
plt.xlabel('epoch')
plt.legend(loc='upper right')
#plt.ylim([0,50])
plt.show()

## Testing the trained model



In [None]:
test_eval = model.evaluate(x_test_norm, y_test)
print ("Test data MAE: {:.2}".format(test_eval[1]))

In [None]:
rmse = round(np.sqrt(test_eval[0]), 3)  # test_eval[0] is the MSE loss
rmse

The model has an RMSE of around USD 4,000 and an MAE of around USD 2,600, which is a good result for house price estimation.

Note: With features **not normalized**, we got loss (MSE): 22.0815; RMSE: USD 4,700 and MAE: USD 3,500
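
The RMSE in the note follows directly from the reported loss, since the loss is the MSE expressed in squared units of USD 1,000. A short check of that arithmetic:

```python
import math

mse_unscaled = 22.0815                    # loss reported in the note, in (USD 1,000)^2
rmse_unscaled = math.sqrt(mse_unscaled)   # back to units of USD 1,000

print(round(rmse_unscaled, 2))            # 4.7, i.e. about USD 4,700
```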

In [None]:
y_hat = model.predict(x_test_norm)
print(y_hat[:5]) # get the output predict values for the 5 first samples

In [None]:
y_test[:5] # get the output real known values for the 5 first samples

In [None]:
plt.hist(y_hat, label='predictions', color = 'b')
plt.hist(y_test, label = 'real values', color = 'r', alpha=0.5)
plt.xlabel('Price in K$')
plt.legend();

## Doing Inference

In [None]:
xt = np.array([1.1, 0., 9., 0., 0.6, 7., 92., 3.8 , 4., 300., 21., 200, 19.5])
xt.shape

In [None]:
x_train.shape

In [None]:
xt = np.reshape(xt, (1, 13))
xt.shape

In [None]:
xt

In [None]:
xt_norm = scaler.transform(xt)
xt_norm

In [None]:
yt = model.predict(xt_norm)
yt

In [None]:
xt = np.array([1.1, 0., 9., 0., 0.6, 7., 92., 3.8 , 4., 300., 21., 200, 19.5])
xt = np.reshape(xt, (1, 13))
xt_norm = scaler.transform(xt)
yt = model.predict(xt_norm)
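
The reshape step is needed because both `scaler.transform` and `model.predict` expect a 2D array of shape (n_samples, n_features), even for a single sample. The sketch below uses only NumPy to show a few equivalent ways of turning the flat vector into a batch of one; the names `sample` and `batch` are illustrative.

```python
import numpy as np

sample = np.array([1.1, 0., 9., 0., 0.6, 7., 92., 3.8, 4., 300., 21., 200, 19.5])
print(sample.shape)                  # (13,)   a flat vector of 13 features

batch = sample.reshape(1, -1)        # -1 lets NumPy infer the 13
print(batch.shape)                   # (1, 13) one sample, 13 features

# Equivalent alternatives
assert np.expand_dims(sample, axis=0).shape == (1, 13)
assert sample[np.newaxis, :].shape == (1, 13)
```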

## Finding the correct Hyperparameters
- [KerasTuner](https://keras.io/keras_tuner/)

KerasTuner is an easy-to-use, scalable hyperparameter optimization framework that solves the pain points of hyperparameter search.


In [None]:
!pip install keras-tuner --upgrade

In [3]:
import tensorflow as tf
import keras_tuner as kt

In [None]:
data = tf.keras.datasets.boston_housing
(x_train, y_train), (x_test, y_test) = data.load_data()

In [5]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

# first we fit the scaler on the training dataset
scaler.fit(x_train)

# then we call the transform method to scale both the training and testing data
x_train_norm = scaler.transform(x_train)
x_test_norm = scaler.transform(x_test)

Write a function that creates and returns a Keras model. Use the `hp` argument to define the hyperparameters during model creation.

In [6]:
def build_model(hp):
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Dense(
      hp.Choice('units', [10, 20, 30]),
      activation='relu'))
  
  model.add(tf.keras.layers.Dense(1))
  model.compile(optimizer='adam', loss='mse')
  return model

Initialize a tuner (here, `RandomSearch`). The `objective` argument specifies the metric used to select the best models, and `max_trials` sets the number of different models to try.

In [7]:
tuner = kt.RandomSearch(
    build_model,
    objective='val_loss',
    max_trials=5)
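
Conceptually, a random search samples a hyperparameter value, trains a model with it, records the objective, and keeps the best trial. The toy sketch below mimics that loop in plain Python. It is not KerasTuner's implementation: `toy_val_loss` is a hypothetical stand-in for "train the model and return `val_loss`", and its numbers are made up.

```python
import random

def toy_val_loss(units):
    """Hypothetical stand-in for training a model and returning its val_loss."""
    return {10: 14.0, 20: 11.5, 30: 12.2}[units]   # made-up numbers

random.seed(0)
choices = [10, 20, 30]      # same candidate values as hp.Choice('units', ...)
max_trials = 5

trials = []
for _ in range(max_trials):
    units = random.choice(choices)               # sample one hyperparameter value
    trials.append((toy_val_loss(units), units))  # record (objective, value)

best_loss, best_units = min(trials)              # lower val_loss is better
print(trials)
print("best:", best_units, best_loss)
```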

Start the search and get the best model:

In [None]:
tuner.search(
    x_train_norm, y_train, 
    epochs=500, 
    # hold out part of the training data for validation,
    # so the test set stays unseen during the search
    validation_split=0.2)

best_model = tuner.get_best_models()[0]

In [None]:
tuner.search_space_summary()

In [None]:
tuner.results_summary()