## Fitting a linear regression model with TensorFlow

In this notebook you will see how to use TensorFlow to fit the parameters (slope and intercept) of a simple linear regression model via gradient descent (GD). 

**Dataset:** You work with the systolic blood pressure and age data of 33 American women, which are defined and visualized in the upper part of the notebook. 

**Content:**

* fit a linear model with the sklearn machine learning library for Python to get the fitted values of the intercept and slope as a reference. 
* use the TensorFlow library to fit the parameters of the simple linear model via GD with the objective of minimizing the MSE loss. 
    * define the computational graph of the model
    * define the loss and the optimizer
    * visualize the computational graph in TensorBoard
    * fit the model parameters via GD and check the current values of the estimated model parameters and the loss after each update step
    * verify that the estimated parameters converge to the values you got from the sklearn fit.  
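
For reference, the quantities behind the steps listed above can be written out as follows. The symbols $n$ (number of observations, here 33) and $\eta$ (learning rate) are notation assumed here for illustration; $a$ is the slope and $b$ the intercept.

$$
\hat{y}_i = a\,x_i + b, \qquad
\mathrm{MSE}(a, b) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (a\,x_i + b)\bigr)^2
$$

$$
\frac{\partial\,\mathrm{MSE}}{\partial a} = -\frac{2}{n}\sum_{i=1}^{n} x_i\,\bigl(y_i - a\,x_i - b\bigr), \qquad
\frac{\partial\,\mathrm{MSE}}{\partial b} = -\frac{2}{n}\sum_{i=1}^{n} \bigl(y_i - a\,x_i - b\bigr)
$$

$$
a \leftarrow a - \eta\,\frac{\partial\,\mathrm{MSE}}{\partial a}, \qquad
b \leftarrow b - \eta\,\frac{\partial\,\mathrm{MSE}}{\partial b}
$$

One plain gradient descent update step applies the last line once; TensorFlow computes the two partial derivatives automatically.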


#### Imports

In [0]:
%tensorflow_version 2.x 

In [0]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline
plt.style.use('default')
from sklearn.linear_model import LinearRegression
%load_ext tensorboard


Here we read in the systolic blood pressure and the age of the 33 American women in our dataset. Then we use the sklearn library to find the optimal values for the slope a and the intercept b.

In [0]:
# Blood Pressure data
x = [22, 41, 52, 23, 41, 54, 24, 46, 56, 27, 47, 57, 28, 48, 58,  9, 
     49, 59, 30, 49, 63, 32, 50, 67, 33, 51, 71, 35, 51, 77, 40, 51, 81]
y = [131, 139, 128, 128, 171, 105, 116, 137, 145, 106, 111, 141, 114, 
     115, 153, 123, 133, 157, 117, 128, 155, 122, 183,
     176,  99, 130, 172, 121, 133, 178, 147, 144, 217] 
x = np.asarray(x, np.float32) 
y = np.asarray(y, np.float32)

In [0]:
plt.scatter(x=x,y=y)
plt.title("blood pressure vs age")
plt.xlabel("x (age)")
plt.ylabel("y (sbp)")

model = LinearRegression()
res = model.fit(x.reshape((len(x),1)), y)
predictions = model.predict(x.reshape((len(x),1)))
plt.plot(x, predictions)
plt.show()
print("intercept = ", res.intercept_, "slope = ", res.coef_[0])
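
As an optional cross-check of the sklearn fit, the least squares estimates for a simple linear regression can also be computed in closed form. The following is a minimal sketch with NumPy only; the variable names (`slope`, `intercept`, `slope_np`, `intercept_np`) are chosen here for illustration, and the data is repeated (in double precision) so that the block runs on its own.

```python
import numpy as np

# Same blood pressure data as in the cell above (repeated so this block is self-contained)
x = np.array([22, 41, 52, 23, 41, 54, 24, 46, 56, 27, 47, 57, 28, 48, 58,  9,
              49, 59, 30, 49, 63, 32, 50, 67, 33, 51, 71, 35, 51, 77, 40, 51, 81],
             dtype=np.float64)
y = np.array([131, 139, 128, 128, 171, 105, 116, 137, 145, 106, 111, 141, 114,
              115, 153, 123, 133, 157, 117, 128, 155, 122, 183,
              176,  99, 130, 172, 121, 133, 178, 147, 144, 217],
             dtype=np.float64)

# Closed-form least squares estimates:
#   slope     = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   intercept = mean(y) - slope * mean(x)
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Second opinion: a degree-1 polynomial fit returns [slope, intercept]
slope_np, intercept_np = np.polyfit(x, y, deg=1)

print("closed form: intercept =", intercept, "slope =", slope)
print("np.polyfit : intercept =", intercept_np, "slope =", slope_np)
```

Both routes should agree with the intercept and slope printed by the sklearn cell, up to floating point precision.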

## TensorFlow

We now use TensorFlow to define the computational graph. Then we run the graph and automatically get the gradients of the loss w.r.t. the variables (slope a and intercept b) in order to update them.

In [0]:
# Defining the graph (construction phase)

@tf.function
def my_func(a_, b_):
  x_  = tf.constant(x, name='x_const')                     # Constants, these are fixed tensors holding the data values and cannot be changed by the optimization
  y_  = tf.constant(y, name='y_const')  

  y_hat_ = a_*x_ + b_                                      # we symbolically calculate y_hat    
  loss_ = tf.reduce_mean(tf.square(y_ - y_hat_))           # The final result, the MSE. Still symbolical
  return loss_

a_  = tf.Variable(0.0, name='a_var')                       # Variables, with starting values, will be optimized later
b_  = tf.Variable(139.0, name='b_var')                     # we name them so that they look nicer in the graph

logdir="linreg/"
writer = tf.summary.create_file_writer(logdir)
tf.summary.trace_on(graph=True, profiler=True)
z = my_func(a_, b_)   #needs one call to write the graph
with writer.as_default():
  tf.summary.trace_export( 
      name="linreg_tensorboard",
      step=0,
      profiler_outdir=logdir)
  

In [0]:
%tensorboard --logdir linreg

#### Let's run the graph, feed in our start values for slope a and intercept b, and fetch the MSE loss


In [0]:
res_val = my_func(a_, b_)                       # Letting the variables a=0 b=139 flow through the graph
res_val
res_val.numpy()

Now we add a gradient-based optimizer and optimize the slope a and the intercept b. The start values are a=0 and b=139 (139 is approximately the mean of the blood pressure, and a slope of a=0 implies that the model predicts this mean for each age). We do 80000 update steps. Note that the code cell creates the optimizer as `tf.keras.optimizers.Adam()` without arguments, so it runs with that optimizer's default learning rate.

In [0]:
a_  = tf.Variable(0.0, name='a_var')                       # Variables, with starting values
b_  = tf.Variable(139.0, name='b_var')
optimizer = tf.keras.optimizers.Adam()                     # Adam optimizer with its default learning rate (0.001)
for i in range(80000):
  with tf.GradientTape() as tape:                          # record the forward pass on the tape
    loss = my_func(a_, b_)
  gradients = tape.gradient(loss, [a_, b_])                # gradients of the loss w.r.t. a and b
  optimizer.apply_gradients(zip(gradients, [a_, b_]))      # one update step
  if i in (1, 2, 3) or i % 5000 == 0:                      # print the first steps, then every 5000th step
    print("Epoch:", i, "slope=", a_.numpy(), "intercept=", b_.numpy(), "mse=", loss.numpy())
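
To make the update step more concrete, here is a minimal sketch of plain gradient descent on the MSE loss, written with NumPy and hand-derived gradients instead of TensorFlow's automatic differentiation. It is not the notebook's training loop: the function names `mse_and_grads` and `gradient_descent` and the learning rate of 0.0004 are assumptions made for this illustration. With these unscaled ages, plain gradient descent is sensitive to the learning rate, so noticeably larger values can make the loss diverge.

```python
import numpy as np

# Same blood pressure data as above (repeated so this block is self-contained)
x = np.array([22, 41, 52, 23, 41, 54, 24, 46, 56, 27, 47, 57, 28, 48, 58,  9,
              49, 59, 30, 49, 63, 32, 50, 67, 33, 51, 71, 35, 51, 77, 40, 51, 81],
             dtype=np.float64)
y = np.array([131, 139, 128, 128, 171, 105, 116, 137, 145, 106, 111, 141, 114,
              115, 153, 123, 133, 157, 117, 128, 155, 122, 183,
              176,  99, 130, 172, 121, 133, 178, 147, 144, 217],
             dtype=np.float64)

def mse_and_grads(a, b):
    """Return the MSE loss and its partial derivatives w.r.t. a and b."""
    residual = y - (a * x + b)
    loss = np.mean(residual ** 2)
    grad_a = -2.0 * np.mean(x * residual)   # d(MSE)/da
    grad_b = -2.0 * np.mean(residual)       # d(MSE)/db
    return loss, grad_a, grad_b

def gradient_descent(a, b, learning_rate, steps):
    """Run plain gradient descent and return the final a, b and loss."""
    for _ in range(steps):
        _, grad_a, grad_b = mse_and_grads(a, b)
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    loss, _, _ = mse_and_grads(a, b)
    return a, b, loss

# Same start values as in the notebook: a=0, b=139
start_loss, _, _ = mse_and_grads(0.0, 139.0)
# learning_rate=0.0004 is an assumed value for this sketch, not taken from the cell above
a_gd, b_gd, loss_gd = gradient_descent(0.0, 139.0, learning_rate=0.0004, steps=80000)

print("start mse =", start_loss)
print("slope =", a_gd, "intercept =", b_gd, "mse =", loss_gd)
```

Comparing these numbers with the TensorFlow run and with the sklearn reference shows how much the choice of optimizer and learning rate matters for a fixed number of update steps.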

Let's look at the final values for the slope a and the intercept b. From the sklearn solution we know that:

1.   optimal value for a:   1.1050216
2.   optimal value for b:   87.67143
3.   minimal loss:         349.200787168560



In [0]:
print(a_.numpy(), b_.numpy(), loss.numpy())


#### Exercise

Compare the optimal values from TensorFlow with the ones from sklearn.  
Do you get the same?    
Try to explain the differences and change the code to get the same results.  