# <font style="color:rgb(50,120,229)">Linear Regression using Keras</font>

In this chapter, we use a neural network to predict housing prices. The same problem can be solved with a classical technique called **Linear Regression**, but here we will see how a simple network can perform the same task.

But before going into that, let's look at what Linear Regression is and the problem we want to solve.

# <font style="color:rgb(50,120,229)">What is Linear Regression?</font>
<img src="https://www.learnopencv.com/wp-content/uploads/2018/02/cv4faces-mod10-ch2-linreg-example.png" width="600">
Linear regression is a linear approach to modeling the relationship between two variables. In the figure above, the values on the x axis are the independent variable (normally referred to as samples), and the values on the y axis are the dependent variable (also known as the target). There are 5 points, and we want to find the straight line that minimizes the sum of all errors (shown by arrows in the figure). In other words, we want the slope and intercept of the line with the least error. Once we have modeled the given data points, we can predict the value on the y axis for a new point on the x axis.
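As a minimal sketch of this idea, the snippet below fits a line to five made-up points with NumPy. The points are hypothetical (they are not the ones in the figure), and we assume the usual least-squares criterion, where the "error" being minimized is the sum of squared vertical distances to the line.

```python
import numpy as np

# Five hypothetical (x, y) points; NOT the points from the figure.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Closed-form least-squares estimates for the line y = slope * x + intercept.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Sum of squared errors between the fitted line and the data.
residuals = y - (slope * x + intercept)
sse = np.sum(residuals ** 2)

# Use the fitted line to predict y for a new x.
x_new = 6.0
y_new = slope * x_new + intercept

print(f"slope={slope:.3f}, intercept={intercept:.3f}, SSE={sse:.3f}")
print(f"prediction at x={x_new}: {y_new:.3f}")
```

Any other choice of slope and intercept would give a larger sum of squared errors on these five points.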

We will learn how to create a simple network with a single layer to perform linear regression. We will use the [**Boston Housing dataset**](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html) available in Keras as an example. Each sample contains 13 attributes of houses at a location in the Boston suburbs in the late 1970s. Example attributes include the average number of rooms and the crime rate. You can find the complete list of attributes [**here**](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html).

The 13 attributes become our 13-dimensional independent variable. The targets are the median values of the houses at a location (in thousands of dollars). Using these 13 features, we train a model that predicts the house prices in the test data.

A schematic diagram of the network we want to create is given below:

<img src="https://www.learnopencv.com/wp-content/uploads/2019/12/regression-keras-schema.png" width="400">

# <font style="color:rgb(50,120,229)">Training</font>
The purpose of training is to find the weights (w0 to w12) and bias (b) for which the network produces the correct output by looking at the input data. We say that the network is trained when the error between the predicted output and ground truth becomes very low and does not decrease further. We can then use these weights to predict the output for any new data.
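To make "finding the weights and bias" concrete, here is a rough NumPy sketch of training a single linear neuron. It makes two assumptions that differ from this chapter's code: the data is synthetic (generated from known weights so we can check the result), and the update rule is plain batch gradient descent rather than the `rmsprop` optimizer used with Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 samples, 13 features,
# generated without noise from known weights and a known bias.
n_samples, n_features = 200, 13
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=n_features)
true_b = 0.5
y = X @ true_w + true_b

# Parameters of the single neuron: 13 weights and 1 bias, all starting at zero.
w = np.zeros(n_features)
b = 0.0
learning_rate = 0.1

losses = []
for step in range(500):
    y_pred = X @ w + b                      # forward pass: weighted sum plus bias
    error = y_pred - y
    losses.append(np.mean(error ** 2))      # mean squared error
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * (X.T @ error) / n_samples
    grad_b = 2.0 * np.mean(error)
    # Step against the gradient to reduce the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
```

The loss falls as the weights approach the values that generated the data. On real data such as the housing prices, the loss levels off at a non-zero value instead, which is the point where we consider the network trained.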

The network consists of just one neuron. We use the Sequential model to create the network graph. Then we add a Dense layer with the number of inputs equal to the number of features in the data (13 in this case) and a single output. Then we follow the workflow explained in the previous section, i.e., we compile the model and train it using the `fit` method.

All Keras datasets come with a `load_data()` function, which returns tuples of training and testing data as shown in the code.

In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.datasets import boston_housing
 
(X_train, Y_train), (X_test, Y_test) = boston_housing.load_data()

print("Training set size: ", X_train.shape)
print("Test set size: ", X_test.shape)
print("Training example features: ", X_train[0,:])
print("Training example output: ", Y_train[0])

nFeatures = X_train.shape[1]

model = Sequential()
model.add(Dense(1, input_shape=(nFeatures,), activation='linear'))
 
model.compile(optimizer='rmsprop', loss='mse', metrics=['mse', 'mae'])

Training set size:  (404, 13)
Test set size:  (102, 13)
Training example features:  [  1.23247   0.        8.14      0.        0.538     6.142    91.7
   3.9769    4.      307.       21.      396.9      18.72   ]
Training example output:  15.2


In [2]:
import tensorflow as tf
print(tf.__version__)

2.0.0


The output of `model.summary()` is given below. It shows 14 parameters - 13 parameters for the weights and 1 for the bias.

In [3]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 1)                 14        
Total params: 14
Trainable params: 14
Non-trainable params: 0
_________________________________________________________________
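The parameter count and the `(None, 1)` output shape in the summary above can be reproduced by hand. The sketch below mimics a `Dense(1)` layer on 13 inputs with plain NumPy arrays. The zero-filled arrays and the batch of ones are placeholders chosen for illustration, not the weights Keras actually initializes.

```python
import numpy as np

n_features = 13  # inputs to the layer
n_units = 1      # neurons in the layer

# A Dense layer stores a weight matrix (called the kernel in Keras) and a bias vector.
kernel = np.zeros((n_features, n_units))
bias = np.zeros(n_units)
n_params = kernel.size + bias.size

# Forward pass with a linear activation on a hypothetical batch of 4 samples.
batch = np.ones((4, n_features))
output = batch @ kernel + bias

print("parameters:", n_params)
print("output shape:", output.shape)
```

The first dimension of the output is the batch size, which Keras reports as `None` because it is not fixed when the model is built.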


In [4]:
# To see detailed output, change verbose to True
model.fit(X_train, Y_train, batch_size=4, epochs=1000, verbose=False)

<tensorflow.python.keras.callbacks.History at 0x7f00e8799e50>

# <font style="color:rgb(50,120,229)">Inference</font>
After the model has been trained, we want to perform inference on the test data. We can find the loss on the test data using the `model.evaluate()` function.

In [5]:
# To see detailed output, change verbose to True
# Returns the loss followed by the metrics specified at the compilation step: [loss (mse), mse, mae].
model.evaluate(X_test, Y_test, verbose=False)

[24.101493536257276, 24.101494, 3.5698843]
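Since the targets are in thousands of dollars, these numbers can be read as dollar amounts. The sketch below copies the values printed above into a list and converts them. The root mean squared error (RMSE) is our addition for readability and is not something `model.evaluate()` returned here.

```python
import math

# Values copied from the evaluate() output above: [loss, mse, mae].
results = [24.101493536257276, 24.101494, 3.5698843]
loss, mse, mae = results

# The square root of the MSE has the same units as the targets (thousands of dollars).
rmse = math.sqrt(mse)

print(f"RMSE: about ${rmse * 1000:,.0f}")
print(f"MAE:  about ${mae * 1000:,.0f}")
```

The loss and the first metric are essentially the same number because both are the mean squared error.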

We get the predictions on test data using the `model.predict()` function. Here we compare the ground truth values with the predictions from our model for the first 5 test samples.

In [6]:
Y_pred = model.predict(X_test)
 
print(Y_test[:5])
print(Y_pred[:5,0])

[ 7.2 18.8 19.  27.  22.2]
[ 6.7474775 17.898502  21.201483  28.910046  23.517996 ]


The predictions follow the ground truth values, but they do not match them exactly.
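To quantify those errors, we can compute the per-sample differences for the five values printed above. This is a small NumPy sketch; the arrays are typed in by hand from that output rather than taken from the model.

```python
import numpy as np

# First five test samples, copied from the printed output above.
y_true = np.array([7.2, 18.8, 19.0, 27.0, 22.2])
y_pred = np.array([6.7474775, 17.898502, 21.201483, 28.910046, 23.517996])

errors = y_pred - y_true            # signed error per sample
mae = np.mean(np.abs(errors))       # mean absolute error
mse = np.mean(errors ** 2)          # mean squared error

print("errors:", np.round(errors, 2))
print(f"MAE over these 5 samples: {mae:.3f}")
print(f"MSE over these 5 samples: {mse:.3f}")
```

These five samples happen to be predicted more accurately than average: the mean absolute error over the whole test set, as reported by `model.evaluate()`, is about 3.57.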

The major takeaway from this introduction is the simple Keras workflow for creating and training neural networks. There might be more pre- and post-processing steps involved, depending on the problem at hand, but the core process remains the same.