# Solving the y = mx + b problem

Given 5 training examples as (x, y) tuples: (2, 3), (3, 4), (4, 5), (5, 6), (6, 7),
solve for **m** and **b**.

- Forward pass prediction: $\hat{y} = mx + b$
- Loss: $0.5*(y - \hat{y})^2$

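We attempt to solve the credit assignment problem by perturbation (the code names the slope `w` rather than **m**):
- We increase **m** and **b**, one at a time, by a small amount and observe whether the loss increases or decreases.
- If the perturbed loss is lower, we increment that parameter by the learning rate.
- Otherwise, we decrement it by the learning rate.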
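
As a quick illustration of the rule above, here is a minimal sketch of a single perturbation decision on the first training point (2, 3), starting from zero parameters. The slope is called `w`, and the helper name `loss_fn` is an assumption made only for this example.

```python
# One perturbation decision on a single data point (illustrative sketch).
x, y = 2, 3          # first training example
w, b = 0.0, 0.0      # current parameters (w plays the role of m)
lr = 0.001           # fixed step size
perturb = 0.001      # size of the trial nudge

def loss_fn(w, b, x, y):
    """Squared-error loss 0.5 * (y - y_hat)^2 for one example (hypothetical helper)."""
    y_hat = w * x + b
    return 0.5 * (y - y_hat) ** 2

base_loss = loss_fn(w, b, x, y)              # loss at the current w, b
loss_w_up = loss_fn(w + perturb, b, x, y)    # loss after nudging w upward

# If nudging w upward lowered the loss, step w up; otherwise step it down.
w_new = w + lr if loss_w_up < base_loss else w - lr

print(f"base loss: {base_loss}, perturbed loss: {loss_w_up}, new w: {w_new}")
```

The same comparison is then repeated for `b`, and for every data point in every epoch.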



In [27]:
x_values = [2, 3, 4, 5, 6]
y_values = [3, 4, 5, 6, 7]

# initialize w and b to be 0
w, b = 0, 0

# Customize your parameters here
epochs = 1000
lr = 0.001
perturb = 0.001

# Perform the training loop
for epoch in range(epochs):
    for (x,y) in zip(x_values, y_values):
        y_pred = w*x + b
        loss = 0.5*(y - y_pred)**2
        
        # do perturbation to find out how much to change w and b
        w_perturb = w + perturb
        y_pred_perturb_w = w_perturb*x + b
        loss_perturb_w = 0.5*(y_pred_perturb_w - y)**2
        
        b_perturb = b + perturb
        y_pred_perturb_b = w*x + b_perturb
        loss_perturb_b = 0.5*(y_pred_perturb_b - y)**2
        
        # modify w and b accordingly
        if loss_perturb_w < loss:
            w += lr
        else:
            w -= lr
        
        if loss_perturb_b < loss:
            b += lr
        else:
            b -= lr
            
    if epoch%100 == 0:
        print(f"Epoch {epoch}: Loss: {loss}, w: {w}, b: {b}\n")
        
print(f"Equation is y = {w:.1f}x + {b:.1f}")      

Epoch 0: Loss: 24.304392000000004, w: 0.005, b: 0.005

Epoch 100: Loss: 6.027391999999992, w: 0.5050000000000003, b: 0.5050000000000003

Epoch 200: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Epoch 300: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Epoch 400: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Epoch 500: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Epoch 600: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Epoch 700: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Epoch 800: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Epoch 900: Loss: 9.860761315262648e-30, w: 0.9990000000000007, b: 0.9990000000000007

Equation is y = 1.0x + 1.0


# Try to do it with backpropagation

Instead of always stepping by a fixed learning rate value, we multiply the learning rate by the gradient of the loss with respect to the parameter, so the step size and direction follow the slope of the loss.

In other words, we are implementing: 
- $w = w-lr*\frac{\partial L}{\partial w}$
- $b = b-lr*\frac{\partial L}{\partial b}$

Here, 

- $L = 0.5*(y - \hat{y})^2$

- $\hat{y} = wx + b$

- $\frac{\partial L}{\partial \hat{y}} = (y - \hat{y})(-1) = \hat{y} - y$

- $\frac{\partial \hat{y}}{\partial w} = x$

- $\frac{\partial \hat{y}}{\partial b} = 1$

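Hence, $\frac{\partial L}{\partial w} = \frac{\partial L}{\partial \hat{y}}\frac{\partial \hat{y}}{\partial w} = (\hat{y} - y)x$

$\frac{\partial L}{\partial b} = \frac{\partial L}{\partial \hat{y}}\frac{\partial \hat{y}}{\partial b} = (\hat{y} - y)(1) = \hat{y} - y$

Substituting into the update rules gives $w = w + lr*(y - \hat{y})x$ and $b = b + lr*(y - \hat{y})$, which is the form used in the code.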
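
As a sanity check, multiplying the partial derivatives listed above gives $\frac{\partial L}{\partial w} = (\hat{y} - y)x$ and $\frac{\partial L}{\partial b} = \hat{y} - y$. The sketch below compares these with central finite differences, which is the same perturbation idea from the first section turned into a slope estimate. The function names and the test point are arbitrary choices for illustration.

```python
# Compare analytic gradients with central finite differences (illustrative sketch).

def loss_fn(w, b, x, y):
    """Squared-error loss 0.5 * (y - y_hat)^2 for one example."""
    y_hat = w * x + b
    return 0.5 * (y - y_hat) ** 2

def analytic_grads(w, b, x, y):
    """Chain rule: dL/dw = (y_hat - y) * x, dL/db = (y_hat - y) * 1."""
    y_hat = w * x + b
    dL_dyhat = y_hat - y
    return dL_dyhat * x, dL_dyhat * 1.0

def numeric_grads(w, b, x, y, eps=1e-6):
    """Estimate each gradient by nudging one parameter up and down by eps."""
    dL_dw = (loss_fn(w + eps, b, x, y) - loss_fn(w - eps, b, x, y)) / (2 * eps)
    dL_db = (loss_fn(w, b + eps, x, y) - loss_fn(w, b - eps, x, y)) / (2 * eps)
    return dL_dw, dL_db

w, b = 0.5, -0.2   # arbitrary test point, not taken from the training runs
x, y = 4, 5        # one of the training examples

print("analytic:", analytic_grads(w, b, x, y))
print("numeric: ", numeric_grads(w, b, x, y))
```

If the two pairs agree to several decimal places, the derivation is very likely right.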

In [37]:
x_values = [2, 3, 4, 5, 6]
y_values = [3, 4, 5, 6, 7]

# initialize w and b to be 0
w, b = 0, 0

# Customize your parameters here
epochs = 1000
lr = 0.001

# Perform the training loop
for epoch in range(epochs):
    for (x,y) in zip(x_values, y_values):
        y_pred = w*x + b
        loss = 0.5*(y - y_pred)**2  # needed for the progress print
        
        # use gradient descent to update w and b
        w += lr * (y-y_pred)*(x)
        b += lr * (y-y_pred)
            
    if epoch%100 == 0:
        print(f"Epoch {epoch}: Loss: {loss}, w: {w}, b: {b}\n")
        
print(f"Equation is y = {w:.1f}x + {b:.1f}")      

Epoch 0: Loss: 21.665097496865293, w: 0.10611744338133801, b: 0.024282829774723

Epoch 100: Loss: 0.02834757250261382, w: 1.1548209956639715, b: 0.3003714746028339

Epoch 200: Loss: 0.025584115471937192, w: 1.1468576141058933, b: 0.336688727618972

Epoch 300: Loss: 0.022998040375833806, w: 1.139237633242645, b: 0.371105891600474

Epoch 400: Loss: 0.020673362999295952, w: 1.1320130246723028, b: 0.4037372546866741

Epoch 500: Loss: 0.018583667595460446, w: 1.125163278613873, b: 0.434675477635787

Epoch 600: Loss: 0.01670520182470304, w: 1.1186689446155902, b: 0.4640084122337957

Epoch 700: Loss: 0.015016614270060393, w: 1.1125115814489146, b: 0.49181935191014

Epoch 800: Loss: 0.013498711748714864, w: 1.1066737047434116, b: 0.5181872682567613

Epoch 900: Loss: 0.012134241154359411, w: 1.1011387373383543, b: 0.5431870352748401

Equation is y = 1.1x + 0.6
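
The result above is still some way from y = 1.0x + 1.0, which suggests that 1000 epochs at lr = 0.001 is not enough for this update rule to converge. Below is a minimal sketch that wraps the same per-example update in a hypothetical helper, `train`, so that different settings can be compared. The alternative settings tried here are just examples.

```python
# Compare learning rate / epoch settings for the same update rule (illustrative sketch).

def train(lr, epochs, x_values=(2, 3, 4, 5, 6), y_values=(3, 4, 5, 6, 7)):
    """Per-example gradient descent on y = w*x + b, starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(x_values, y_values):
            y_pred = w * x + b
            error = y_pred - y      # dL/dy_pred for L = 0.5 * (y - y_pred)^2
            w -= lr * error * x     # dL/dw = error * x
            b -= lr * error         # dL/db = error
    return w, b

# First entry matches the settings used above; the others are example alternatives.
for lr, epochs in [(0.001, 1000), (0.001, 10000), (0.01, 1000)]:
    w, b = train(lr, epochs)
    print(f"lr={lr}, epochs={epochs}: y = {w:.3f}x + {b:.3f}")
```

Either a larger learning rate or more epochs should bring the fit closer to the true line. A learning rate that is too large can make the updates diverge instead.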


# TensorFlow for Linear Regression

Now let us use TensorFlow for Linear Regression.

This is actually a more advanced, lower-level way of using TensorFlow, as it uses `tf.GradientTape` to record the forward pass and compute the gradients for us.

We do this so that you can see what TensorFlow is doing under the hood.

Next week, we will see that TensorFlow has many abstractions that make the training pipeline much easier (without needing to code at this level, unless you want to become a researcher next time XD).

### You try it: Try changing the learning rate / epoch and see what happens

In [35]:
import tensorflow as tf

x_values = [2, 3, 4, 5, 6]
y_values = [3, 4, 5, 6, 7]

# Initialize w and b to be 0
w = tf.Variable([0.0])
b = tf.Variable([0.0])

# Customize your parameters here
epochs = 1000
lr = 0.01

# Perform the training loop
for epoch in range(epochs):
    for (x,y) in zip(x_values, y_values):
        with tf.GradientTape() as tape:
            # do your model forward pass here
            y_pred = w*x + b

            # do loss function here
            loss = 0.5*(y - y_pred)**2

        grads = tape.gradient(loss, [w,b])
       
        w.assign_sub(lr * grads[0])
        b.assign_sub(lr * grads[1])
    
    if epoch%100 == 0:
        print(f"Epoch {epoch}: Loss: {loss[0]}, w: {w.numpy()[0]}, b: {b.numpy()[0]}\n")
        
print(f"Equation is y = {w.numpy()[0]:.1f}x + {b.numpy()[0]:.1f}")

Epoch 0: Loss: 6.254148960113525, w: 0.7640827894210815, b: 0.1873777210712433

Epoch 100: Loss: 0.009376347064971924, w: 1.082818865776062, b: 0.5893599390983582

Epoch 200: Loss: 0.0030395882204174995, w: 1.0471539497375488, b: 0.7661970257759094

Epoch 300: Loss: 0.0009853508090600371, w: 1.0268474817276, b: 0.8668816685676575

Epoch 400: Loss: 0.0003194306918885559, w: 1.0152859687805176, b: 0.9242074489593506

Epoch 500: Loss: 0.00010354965343140066, w: 1.008703351020813, b: 0.9568465948104858

Epoch 600: Loss: 3.3570569939911366e-05, w: 1.0049554109573364, b: 0.9754296541213989

Epoch 700: Loss: 1.0880636182264425e-05, w: 1.002821445465088, b: 0.9860103130340576

Epoch 800: Loss: 3.528389243001584e-06, w: 1.0016064643859863, b: 0.9920346140861511

Epoch 900: Loss: 1.14314855181874e-06, w: 1.0009146928787231, b: 0.9954645037651062

Equation is y = 1.0x + 1.0
