# Objective
The objective is to use regression to predict the median value of owner-occupied homes (MEDV), measured in thousands of US dollars.
We will build a regression model (a small feed-forward neural network with a single linear output unit) using tensorflow.keras.

Data: https://raw.githubusercontent.com/dphi-official/Datasets/master/Boston_Housing/Training_set_boston.csv

In [95]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [96]:
df = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/Boston_Housing/Training_set_boston.csv')
df.head()

Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,MEDV
0,15.0234,0.0,18.1,0.0,0.614,5.304,97.3,2.1007,24.0,666.0,20.2,349.48,24.91,12.0
1,0.62739,0.0,8.14,0.0,0.538,5.834,56.5,4.4986,4.0,307.0,21.0,395.62,8.47,19.9
2,0.03466,35.0,6.06,0.0,0.4379,6.031,23.3,6.6407,1.0,304.0,16.9,362.25,7.83,19.4
3,7.05042,0.0,18.1,0.0,0.614,6.103,85.1,2.0218,24.0,666.0,20.2,2.52,23.29,13.4
4,0.7258,0.0,8.14,0.0,0.538,5.727,69.5,3.7965,4.0,307.0,21.0,390.95,11.28,18.2


### Separating input and output features

In [97]:
X = df.drop('MEDV', axis = 1)
y = df.MEDV

### Splitting dataset into train and test data

In [98]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 29)

In [99]:
# number of input features
n_features = X.shape[1]
n_features

13

### Training model

#### 1) Defining model

In [100]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from numpy.random import seed
import tensorflow

In [101]:
# defining model
model = Sequential()
model.add(Dense(10, activation = 'relu', input_shape = (n_features,)))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(1))
# Note: the input layer of the network is defined implicitly by the input_shape argument on the first hidden layer.

#### 2) Compiling model

In [102]:
# import RMSprop optimizer
from tensorflow.keras.optimizers import RMSprop
optimizer = RMSprop(0.01)                            # 0.01 is the learning rate

In [103]:
model.compile(loss = 'mean_squared_error', optimizer = optimizer)       # compiling the model

#### 3) Fitting model

In [104]:
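seed_value = 29
seed(seed_value)        # Fixing the random seeds helps produce the same result across multiple executions
# Note: the model above was created (and its weights initialized) before these seeds are set;
# for fully repeatable runs, set the seeds before building the model.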

# Recommended by Keras -------------------------------------------------------------------------------------
# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)

# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)

# 4. Set the `tensorflow` pseudo-random generator at a fixed value
tensorflow.random.set_seed(seed_value) 


model.fit(X_train, y_train, epochs = 10, batch_size = 30, verbose = 1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x25cb2924b80>

The verbose argument controls how training progress is displayed for each epoch:

verbose=0 shows nothing (silent).

verbose=1 shows an animated progress bar for each epoch.

verbose=2 prints one line per epoch, without the progress bar.

#### 4) Evaluating model

In [105]:
model.evaluate(X_test, y_test)



73.30943298339844

The mean squared error we got here is about 73.31. Now, what does it mean?

For each example in the test set, subtract the predicted value (the model's output for X_test) from the actual value (y_test) and square the difference. The mean (i.e. average) of these squared differences is the MSE, which is 73.31 in this case.

evaluate() does this task automatically. If you want to get the predictions for X_test, you can call model.predict(X_test).
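The calculation described above can be sketched by hand in a few lines. This is only an illustration: the numbers below are hypothetical toy values, not taken from the Boston data, and `mean_squared_error` is a helper defined here rather than a Keras function.

```python
# Toy values (hypothetical, not taken from the Boston data)
y_true = [12.0, 20.0, 19.5, 13.5]   # actual target values
y_pred = [14.0, 19.0, 21.5, 13.5]   # values a model might have predicted

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)

mse = mean_squared_error(y_true, y_pred)
rmse = mse ** 0.5   # square root puts the error back in the units of the target

print('MSE :', mse)    # 2.25
print('RMSE:', rmse)   # 1.5
```

Taking the square root is often easier to interpret: since MEDV is expressed in thousands of dollars, the RMSE is also in thousands of dollars.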

### Hyperparameter tuning

#### 1) Learning rate

In [106]:
# defining model
model = Sequential()
model.add(Dense(10, activation = 'relu', input_shape = (n_features,)))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(1))

# compiling model
optimizer = RMSprop(0.1)
model.compile(loss = 'mean_squared_error', optimizer = optimizer)

# fitting model
model.fit(X_train, y_train, epochs = 10, batch_size = 30, verbose = 1)

# evaluating model
print('The MSE value is:', model.evaluate(X_test, y_test))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
The MSE value is: 280.5528259277344


In [107]:
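# playing with different learning rates
# (the model is not rebuilt here, so training continues from the weights learned in the previous cell)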
learning_rate = 0.05
epochs = 10
optimizer = RMSprop(learning_rate)
model.compile(loss = 'mean_squared_error', optimizer = optimizer)
model.fit(X_train, y_train, epochs = epochs, batch_size = 30)
model.evaluate(X_test, y_test)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


156.0909423828125

#### 2) Epochs

In [108]:
model = Sequential()
model.add(Dense(10, activation = 'relu', input_shape = (n_features,)))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(1))

optimizer = RMSprop(0.01)

model.compile(loss = 'mean_squared_error', optimizer = optimizer)
model.fit(X_train, y_train, epochs = 100, batch_size = 30, verbose = 1)
model.evaluate(X_test, y_test)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

27.412565231323242

In [109]:
# Playing with epochs and learning rate
model = Sequential()
model.add(Dense(10, activation = 'relu', input_shape = (n_features,)))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(1))
learning_rate = 0.01
epochs = 100
optimizer = RMSprop(learning_rate)

model.compile(loss = 'mean_squared_error', optimizer = optimizer)
model.fit(X_train, y_train, epochs = epochs, batch_size = 30)
model.evaluate(X_test, y_test)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

41.66178512573242

#### 3) Batch Size

In [110]:
model = Sequential()
model.add(Dense(10, activation = 'relu', input_shape = (n_features,)))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(1))

optimizer = RMSprop(0.1)

model.compile(loss = 'mean_squared_error', optimizer = optimizer)
model.fit(X_train, y_train, epochs = 10, batch_size = 40, verbose = 1)
model.evaluate(X_test, y_test)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


315.5292663574219

In [111]:
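# playing with batch size
# (the model is not rebuilt here, so training continues from the weights learned in the previous cell)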
learning_rate = 0.01
epochs = 100
batch_size = 40

optimizer = RMSprop(learning_rate)

model.compile(loss = 'mean_squared_error', optimizer = optimizer)
model.fit(X_train, y_train, epochs = epochs, batch_size = batch_size, verbose = 1)
model.evaluate(X_test, y_test)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

129.19168090820312

#### Summary of Hyper-parameter tuning
1) Training loss should steadily decrease, steeply at first, and then more slowly until the slope of the curve reaches or approaches zero.

2) If the training loss does not converge, train for more epochs.

3) If the training loss decreases too slowly, increase the learning rate. Note that setting the learning rate too high may also prevent the training loss from converging.

4) If the training loss varies wildly (that is, the training loss jumps around), decrease the learning rate.

5) Lowering the learning rate while increasing the number of epochs or the batch size is often a good combination.

6) Setting the batch size to a very small value can also cause instability. First, try large batch size values. Then, decrease the batch size until you see degradation.

7) For real-world datasets consisting of a very large number of examples, the entire dataset might not fit into memory. In such cases, you'll need to reduce the batch size to enable a batch to fit into memory.
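Points 3) and 4) can be illustrated with a tiny, self-contained experiment. This is a sketch under simplifying assumptions: it uses plain gradient descent (not RMSprop) on a hypothetical one-parameter loss, `loss(w) = (w - 3)^2`, so the exact numbers do not transfer to the Keras models above; only the qualitative behaviour does.

```python
def train(learning_rate, steps=20, w=0.0):
    """Run plain gradient descent on loss(w) = (w - 3)^2 and return the final loss."""
    for _ in range(steps):
        gradient = 2 * (w - 3)              # derivative of (w - 3)^2 with respect to w
        w = w - learning_rate * gradient    # gradient descent update
    return (w - 3) ** 2

initial_loss = (0.0 - 3) ** 2               # loss before any training step

# Too small, reasonable, and too large learning rates
final_losses = {lr: train(lr) for lr in (0.01, 0.1, 1.1)}
for lr, loss in final_losses.items():
    print(f'learning_rate={lr}: final loss = {loss:.4f}')
```

With the smallest rate the loss is still far from zero after 20 steps, with the middle rate it is close to zero, and with the largest rate the loss ends up higher than where it started, because every update overshoots the minimum.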

### Implementing Hyperparameter tuning using Sklearn

We can automate hyperparameter tuning using GridSearchCV.

Implementing GridSearchCV with Sklearn involves the following steps:

1) Define the general architecture of the model

2) Define the hyperparameters grid to be validated

3) Run the GridSearchCV process

4) Print the results of the best model

We will integrate Sklearn and Keras by (a) writing a create_model function that builds the model in an automated way, and (b) defining a KerasRegressor model, which is an implementation of the scikit-learn regressor API for Keras.
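To make clear what a grid search enumerates, here is a minimal sketch using only the Python standard library. The batch sizes and epoch counts are the ones used in this section; `toy_score` is a hypothetical stand-in for the cross-validated MSE that GridSearchCV would actually obtain by training Keras models, and `grid_search` is an illustrative helper, not part of scikit-learn.

```python
from itertools import product

# Same candidate values as the grid used in this section
param_grid = {'batch_size': [10, 20, 30, 40, 60, 80, 100], 'epochs': [10, 50, 100]}
cv = 5   # number of cross-validation folds

# Every combination of the listed values is one candidate configuration
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]
n_fits = len(candidates) * cv   # each candidate is trained once per fold

def grid_search(candidates, score_fn):
    """Return the candidate with the lowest score (lower is better, as with MSE)."""
    return min(candidates, key=score_fn)

def toy_score(params):
    # Hypothetical score, used only so the sketch runs without training a network
    return abs(params['batch_size'] - 30) + abs(params['epochs'] - 50)

best = grid_search(candidates, toy_score)
print(len(candidates), 'candidates,', n_fits, 'model fits')   # 21 candidates, 105 model fits
print('Best params:', best)
```

The count explains why grid search gets expensive quickly: 7 batch sizes times 3 epoch settings times 5 folds already means 105 training runs, plus one final refit on the whole training set when refit is left at its default.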

In [112]:
# Import GridSearchCV and Keras Regressor
from sklearn.model_selection import GridSearchCV
# Note: this wrapper is deprecated in recent Keras releases in favor of the SciKeras package
from keras.wrappers.scikit_learn import KerasRegressor

# 1) Define model through a user defined function
def create_model(learning_rate = 0.01):
    # Create a fresh optimizer for every model instead of sharing one default instance
    optimizer = RMSprop(learning_rate)
    model = Sequential()
    model.add(Dense(10, activation = 'relu', input_shape = (n_features,)))
    model.add(Dense(8, activation = 'relu'))      # input_shape is only needed on the first layer
    model.add(Dense(1))
    model.compile(loss = 'mean_squared_error', optimizer = optimizer)
    return model

# 2) Define the hyperparameters grid to be validated
batch_size = [10, 20, 30, 40, 60, 80, 100]
epochs = [10, 50, 100]
# Note: `nb_epoch` is the legacy Keras 1 name for the number of epochs. Keras 2's fit() expects
# `epochs`, so on recent versions use `epochs = epochs` here; otherwise the setting may be ignored.
param_grid = dict(batch_size = batch_size, nb_epoch = epochs)
model = KerasRegressor(build_fn = create_model, verbose = 1)
grid = GridSearchCV(estimator = model, param_grid = param_grid, cv = 5, n_jobs = -1)

# 3) Run GridSearchCV process
grid_result = grid.fit(X_train, y_train, verbose = 1)

# 4) Print the results of the best model
print('Best params:' +  str(grid_result.best_params_))





Best params:{'batch_size': 10, 'nb_epoch': 100}


In [113]:
# import cross validation evaluator
from sklearn.model_selection import cross_val_score

# Measure model's performance
results = cross_val_score(grid.best_estimator_, X_test, y_test, cv = 5)
print('Results: \n * Mean:', -results.mean(), '\n * Std:', results.std())

Results: 
 * Mean: 175.93637084960938 
 * Std: 49.54020963747385


### Implementing Hyperparameter tuning with Keras
We can automate hyperparameter tuning using Random Search with Keras (prefer this over the first method, which uses sklearn).

Implementing Random Search with Keras involves the following steps:

1) Install and import all the packages needed

2) Define the general architecture of the model through a creation function

3) Define the hyperparameter search space and the tuner

4) Run the random search process

5) Print the results of the best model

To execute the hyperparameter tuning procedure we will use keras-tuner, a library that helps you pick the optimal set of hyperparameters for your TensorFlow model.
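Before using the library, the idea of random search can be sketched with the standard library alone. This is an illustration, not keras-tuner's implementation: `random_search` and `toy_score` are hypothetical helpers, and the toy score merely stands in for the MSE that a real trial would obtain by training a model. The learning-rate choices and the seed mirror the values used in this section.

```python
import random

# Search space mirroring the learning-rate choices used in this section
search_space = {'learning_rate': [1e-1, 1e-2, 1e-3, 1e-4]}

def random_search(search_space, max_trials, score_fn, seed=29):
    """Sample distinct configurations at random and return (score, params) pairs, best first."""
    rng = random.Random(seed)
    n_combinations = 1
    for values in search_space.values():
        n_combinations *= len(values)

    tried, results = set(), []
    # Stop at max_trials, or earlier if every combination has already been tried
    while len(results) < min(max_trials, n_combinations):
        params = {name: rng.choice(values) for name, values in search_space.items()}
        key = tuple(sorted(params.items()))
        if key in tried:
            continue                      # skip configurations that were already evaluated
        tried.add(key)
        results.append((score_fn(params), params))
    return sorted(results, key=lambda item: item[0])   # lower score is better

def toy_score(params):
    # Hypothetical score, used only so the sketch runs without training a network
    return abs(params['learning_rate'] - 0.01)

results = random_search(search_space, max_trials=5, score_fn=toy_score)
print('Trials run:', len(results))
print('Best params:', results[0][1])
```

Because this search space holds only four distinct learning rates, asking for five trials still yields four: once every combination has been tried there is nothing new to sample.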

In [114]:
# 0) Install and import all the packages needed

! pip install -q -U keras-tuner
import keras_tuner as kt      # current module name of the keras-tuner package (formerly kerastuner)

# 1) Define model through a user defined function
def model_builder(hp):
    model = Sequential()
    model.add(Dense(10, activation = 'relu', input_shape = (n_features,)))
    model.add(Dense(8, activation = 'relu'))
    model.add(Dense(1))
    hp_learning_rate = hp.Choice('learning_rate', values = [1e-1, 1e-2, 1e-3, 1e-4])
    optimizer = RMSprop(learning_rate = hp_learning_rate)
    model.compile(loss = 'mse', metrics = ['mse'], optimizer = optimizer)
    return model

# 2) Define the hyperparameter search space and the tuner
tuner_rs = kt.RandomSearch(
    model_builder,                 # Takes hyperparameters (hp) and returns a Model instance
    objective = 'mse',             # Metric to minimize; 'mse' is the training metric, 'val_mse' would select on the validation split
    seed = 29,                     # Random seed for replication purposes
    max_trials = 5,                # Total number of trials (model configurations) to test at most
    directory = 'random_search')   # Path to the working directory (relative)

# 3) Run the random search process
tuner_rs.search(X_train, y_train, epochs = 10, validation_split = 0.2, verbose = 1)

INFO:tensorflow:Reloading Oracle from existing project random_search\untitled_project\oracle.json
INFO:tensorflow:Reloading Tuner from random_search\untitled_project\tuner0.json
INFO:tensorflow:Oracle triggered exit


In [115]:
tuner_rs.results_summary()

Results summary
Results in random_search\untitled_project
Showing 10 best trials
<keras_tuner.engine.objective.Objective object at 0x0000025CBE359910>
Trial summary
Hyperparameters:
learning_rate: 0.01
Score: 60.36199188232422
Trial summary
Hyperparameters:
learning_rate: 0.001
Score: 69.35105895996094
Trial summary
Hyperparameters:
learning_rate: 0.1
Score: 113.26669311523438
Trial summary
Hyperparameters:
learning_rate: 0.0001
Score: 25066.525390625


In [116]:
# Print the results of the best model
best_model = tuner_rs.get_best_models(num_models = 1)[0]
best_model.evaluate(X_test, y_test)



[62.41484451293945, 62.41484451293945]

In [117]:
best_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 10)                140       
                                                                 
 dense_1 (Dense)             (None, 8)                 88        
                                                                 
 dense_2 (Dense)             (None, 1)                 9         
                                                                 
Total params: 237
Trainable params: 237
Non-trainable params: 0
_________________________________________________________________
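The parameter counts in the summary above can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A small sketch, assuming the 13 input features reported by n_features earlier; `dense_params` is a helper defined here for illustration.

```python
def dense_params(n_inputs, n_units):
    """Number of trainable parameters in a Dense layer: weights plus biases."""
    return n_inputs * n_units + n_units

layer_sizes = [13, 10, 8, 1]   # input features, then the units of each Dense layer
counts = [dense_params(n_in, n_out) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

print(counts)        # [140, 88, 9]
print(sum(counts))   # 237
```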


#### 5) Making prediction

New test data: https://raw.githubusercontent.com/dphi-official/Datasets/master/Boston_Housing/Testing_set_boston.csv

In [118]:
new_test_data = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/Boston_Housing/Testing_set_boston.csv')
new_test_data.head()


Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT
0,0.09178,0.0,4.05,0.0,0.51,6.416,84.1,2.6463,5.0,296.0,16.6,395.5,9.04
1,0.05644,40.0,6.41,1.0,0.447,6.758,32.9,4.0776,4.0,254.0,17.6,396.9,3.53
2,0.10574,0.0,27.74,0.0,0.609,5.983,98.8,1.8681,4.0,711.0,20.1,390.11,18.07
3,0.09164,0.0,10.81,0.0,0.413,6.065,7.8,5.2873,4.0,305.0,19.2,390.91,5.52
4,5.09017,0.0,18.1,0.0,0.713,6.297,91.8,2.3682,24.0,666.0,20.2,385.09,17.27


In [None]:
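# `model` now refers to the unfitted KerasRegressor wrapper from the grid search section,
# so predict with the tuned Keras model instead
best_model.predict(new_test_data)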