## Fitting a linear regression model with TensorFlow

In this notebook you will see how to use TensorFlow to fit the parameters (slope and intercept) of a simple linear regression model via gradient descent (GD). 
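
For orientation, the quantities that gradient descent works with can be written out explicitly. This is a sketch of the standard full-batch formulation; the symbols $a$ (slope) and $b$ (intercept) follow the code further down, while $n$ (number of data points) and $\eta$ (learning rate) are notation assumed here for illustration.

```latex
\begin{aligned}
\hat{y}_i &= a\,x_i + b \\
L(a, b) &= \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i\bigr)^2 \\
\frac{\partial L}{\partial a} &= -\frac{2}{n}\sum_{i=1}^{n} x_i\,\bigl(y_i - \hat{y}_i\bigr),
\qquad
\frac{\partial L}{\partial b} = -\frac{2}{n}\sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr) \\
a &\leftarrow a - \eta\,\frac{\partial L}{\partial a},
\qquad
b \leftarrow b - \eta\,\frac{\partial L}{\partial b}
\end{aligned}
```

Each update step evaluates the two partial derivatives at the current $(a, b)$ and moves both parameters a small distance against the gradient.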

**Dataset:** You work with systolic blood pressure and age data from 33 American women; the data is generated and visualized in the upper part of the notebook.

**Content:**

* fit a linear model with the sklearn machine learning library for Python to obtain reference values for the intercept and slope.
* use the TensorFlow library to fit the parameters of the simple linear model via GD, with the objective of minimizing the MSE loss:
    * define the computational graph of the model
    * define the loss and the optimizer
    * visualize the computational graph in TensorBoard
    * fit the model parameters via GD and check the current values of the estimated model parameters and the loss after each update step
    * verify that the estimated parameters converge to the values obtained from the sklearn fit
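
The reference fit in the first bullet has a closed-form least-squares solution. As a minimal sketch, and as a dependency-light stand-in for the sklearn fit, the same solution can be computed with NumPy's `np.polyfit`. The data values are copied from the data cell of this notebook; the names `slope_ref`, `intercept_ref` and `mse_ref` are illustrative.

```python
import numpy as np

# Age (x) and systolic blood pressure (y), copied from the notebook's data cell
x_data = np.array([22, 41, 52, 23, 41, 54, 24, 46, 56, 27, 47, 57, 28, 48, 58, 9,
                   49, 59, 30, 49, 63, 32, 50, 67, 33, 51, 71, 35, 51, 77, 40, 51, 81],
                  dtype=np.float64)
y_data = np.array([131, 139, 128, 128, 171, 105, 116, 137, 145, 106, 111, 141, 114,
                   115, 153, 123, 133, 157, 117, 128, 155, 122, 183, 176, 99, 130, 172,
                   121, 133, 178, 147, 144, 217], dtype=np.float64)

# Closed-form least-squares fit of y = slope * x + intercept (degree-1 polynomial).
# np.polyfit returns the coefficients from highest to lowest degree.
slope_ref, intercept_ref = np.polyfit(x_data, y_data, deg=1)

# MSE at the least-squares optimum: the loss that gradient descent should approach
mse_ref = np.mean((y_data - (slope_ref * x_data + intercept_ref)) ** 2)

print(f"slope: {slope_ref:.4f}, intercept: {intercept_ref:.4f}, mse: {mse_ref:.4f}")
```

These values serve as the target that the gradient-descent estimates are compared against in the last bullet.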


#### Imports

In [4]:
import torch
import numpy as np

# Data
x_data = np.array([22, 41, 52, 23, 41, 54, 24, 46, 56, 27, 47, 57, 28, 48, 58, 9,
                   49, 59, 30, 49, 63, 32, 50, 67, 33, 51, 71, 35, 51, 77, 40, 51, 81], np.float32)
y_data = np.array([131, 139, 128, 128, 171, 105, 116, 137, 145, 106, 111, 141, 114,
                   115, 153, 123, 133, 157, 117, 128, 155, 122, 183, 176, 99, 130, 172, 
                   121, 133, 178, 147, 144, 217], np.float32)

x = torch.tensor(x_data, requires_grad=False)
y = torch.tensor(y_data, requires_grad=False)

# Variables
a = torch.tensor([0.0], requires_grad=True)  # slope
b = torch.tensor([139.0], requires_grad=True)  # intercept

# Optimizer
optimizer = torch.optim.SGD([a, b], lr=0.0001)

# Training loop
for i in range(500_000):
    optimizer.zero_grad()  # Reset gradients to zero; necessary before backprop
    y_hat = a * x + b  # Model prediction
    loss = torch.mean((y - y_hat) ** 2)  # Mean squared error
    
    loss.backward()  # Compute gradients
    optimizer.step()  # Update parameters
    
    if i in [1, 2, 3] or i % 5000 == 0:
        print(f"Epoch: {i}, slope: {a.item()}, intercept: {b.item()}, mse: {loss.item()}")
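
To see what a single `optimizer.step()` does in the loop above, the update can be written out by hand. This is a minimal sketch in plain NumPy, assuming plain gradient descent without momentum (the default for `torch.optim.SGD`); the helper `gd_step` and the names `a0`, `b0`, `lr` are hypothetical and only mirror the starting values and learning rate used above.

```python
import numpy as np

# Same data as in the cell above, in float64 for the hand calculation
x_data = np.array([22, 41, 52, 23, 41, 54, 24, 46, 56, 27, 47, 57, 28, 48, 58, 9,
                   49, 59, 30, 49, 63, 32, 50, 67, 33, 51, 71, 35, 51, 77, 40, 51, 81],
                  dtype=np.float64)
y_data = np.array([131, 139, 128, 128, 171, 105, 116, 137, 145, 106, 111, 141, 114,
                   115, 153, 123, 133, 157, 117, 128, 155, 122, 183, 176, 99, 130, 172,
                   121, 133, 178, 147, 144, 217], dtype=np.float64)

a0, b0 = 0.0, 139.0  # starting slope and intercept, as in the loop above
lr = 1e-4            # learning rate, as passed to the optimizer above


def gd_step(a, b, x, y, lr):
    """One full-batch gradient-descent step on the MSE of y_hat = a * x + b."""
    residual = y - (a * x + b)
    grad_a = -2.0 * np.mean(x * residual)  # dL/da
    grad_b = -2.0 * np.mean(residual)      # dL/db
    return a - lr * grad_a, b - lr * grad_b


a1, b1 = gd_step(a0, b0, x_data, y_data, lr)
print(f"slope: {a1:.6f}, intercept: {b1:.6f}")
```

After one step, the slope and intercept should agree with the first line of the training output (slope about 0.0553, intercept just below 139), up to float32 rounding.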


Epoch: 0, slope: 0.05530909448862076, intercept: 138.9999237060547, mse: 673.4545288085938
Epoch: 1, slope: 0.08415231853723526, intercept: 138.9993438720703, mse: 650.182373046875
Epoch: 2, slope: 0.0991988405585289, intercept: 138.9984893798828, mse: 643.8486328125
Epoch: 3, slope: 0.10705311596393585, intercept: 138.99749755859375, mse: 642.1177978515625
Epoch: 5000, slope: 0.2193760722875595, intercept: 133.61346435546875, mse: 583.372802734375
Epoch: 10000, slope: 0.31234320998191833, intercept: 128.79087829589844, mse: 536.7905883789062
Epoch: 15000, slope: 0.39555972814559937, intercept: 124.47410583496094, mse: 499.4712219238281
Epoch: 20000, slope: 0.4700349271297455, intercept: 120.61077117919922, mse: 469.5780029296875
Epoch: 25000, slope: 0.5366916060447693, intercept: 117.15301513671875, mse: 445.6317138671875
Epoch: 30000, slope: 0.596350371837616, intercept: 114.05827331542969, mse: 426.4493408203125
Epoch: 35000, slope: 0.649747908115387, intercept: 111.28832244873047, 