Now that you have gone through the theory of Linear Regression, this notebook will help you visualize what happens behind the scenes in Linear Regression with a single feature. I describe how you can implement Linear Regression both through the mathematical formula (which is much faster) and through Gradient Descent.



 <span style="font-size:25px;"> <font color='blue'><b> Importing important Libraries <hr></b></font> </span>

In [1]:
import numpy as np # Numpy can help make mathematical computations faster. 
#In case you wonder how fast it makes the computation, go ahead and try these using lists directly.
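
To make the speed claim above concrete, here is a minimal sketch that times the same arithmetic on a plain Python list and on a NumPy array. The names `values_list`, `values_array`, `scale_list`, and `scale_array` are made up for this illustration, and the timings will vary from machine to machine, so treat the numbers as indicative only.

```python
import timeit

import numpy as np

n = 200_000
values_list = list(range(n))   # plain Python list
values_array = np.arange(n)    # NumPy array holding the same values


def scale_list():
    # Element-by-element arithmetic in a Python-level loop
    return [v * 10 + 5 for v in values_list]


def scale_array():
    # The same arithmetic applied to the whole array at once
    return values_array * 10 + 5


list_time = timeit.timeit(scale_list, number=5)
array_time = timeit.timeit(scale_array, number=5)
print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy array:        {array_time:.4f} s")
```

Both functions produce the same values; only the time taken differs.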

<span style="font-size:25px;"> <font color='blue'><b> Creating an array with points of a Straight Line
 <hr></b></font> </span>

In [2]:
# I have created points on the line y = 10x + 5
# Later, I will try to recover this line from the points.

X = np.arange(100)
y = 10 * X + 5

In [3]:
## Plot of X and y
import plotly.graph_objs as go
import plotly.offline as pyo

# Create the trace
trace = go.Scatter(x=X, y=y, mode='markers')

# Create the layout
layout = go.Layout(title='Plot of X and y', xaxis=dict(title='X'), yaxis=dict(title='y'))

# Combine the trace and layout
fig = go.Figure(data=[trace], layout=layout)

# Plot the graph
pyo.iplot(fig)


<span style="font-size:25px;"> <font color='blue'><b> Using Mathematical Formula
 <hr></b></font> </span>


The mathematical derivation is covered in [this blog](https://towardsdatascience.com/linear-regression-from-scratch-cd0dee067f72). Read it and make sure you understand the underlying concepts.<br><b>Note: </b> Root mean square error (RMSE) is a metric for evaluating regression models. For now, RMSE is sufficient for evaluating the model, so you may skip the R2 metric described in the blog; it will be covered later in the course.
<p>The formula for RMSE is:</p>
<p>$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$</p>
<p>where $\hat{y}_i$ is the predicted value corresponding to $X_i$, $y_i$ is the real value, and $n$ is the number of points.</p>
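
The RMSE formula translates almost directly into NumPy. The sketch below is one possible implementation; the function name `rmse` and the small sample arrays are assumptions made for this illustration and are not part of the notebook's data.

```python
import numpy as np


def rmse(y_true, y_hat):
    """Root mean square error between real values and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    # Square the errors, average them, then take the square root
    return np.sqrt(np.mean((y_true - y_hat) ** 2))


# Hypothetical example: errors of -1, 1, -2, 2 give a mean squared error of 2.5
y_true = np.array([5.0, 15.0, 25.0, 35.0])
y_hat = np.array([6.0, 14.0, 27.0, 33.0])
print(rmse(y_true, y_hat))  # square root of 2.5
```

A perfect prediction gives an RMSE of exactly 0, and larger values mean the predictions are further from the real values on average.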

In [4]:
## Implementation through the direct method (closed-form solution of the optimization problem)

# Calculate the mean of X and y

X_mean = np.mean(X) 
y_mean = np.mean(y)

n = len(X)

numerator = 0
denominator = 0
for i in range(n):

    numerator += (X[i] - X_mean) * (y[i] - y_mean) # Accumulate the product of the deviations of X and y from their means
    denominator += (X[i] - X_mean) ** 2 # Accumulate the squared deviation of X from its mean

b1 = numerator / denominator
b0 = y_mean - (b1*X_mean)

print(b1,b0)

y_pred_m = X*b1 + b0 

10.0 5.0
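
The loop above evaluates the closed-form solution

$$b_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(y_i - \bar{y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2}, \qquad b_0 = \bar{y} - b_1\bar{X}$$

The same computation can be sketched without an explicit Python loop by letting NumPy operate on whole arrays. The names `x_dev`, `y_dev`, `slope`, and `intercept` are introduced only for this illustration. The call to `np.polyfit` is a separate least-squares routine used here as a cross-check, not the method this notebook implements.

```python
import numpy as np

# Same data as in the notebook: points on the line y = 10x + 5
X = np.arange(100)
y = 10 * X + 5

# Deviations of every point from the mean, computed in one step
x_dev = X - X.mean()
y_dev = y - y.mean()

b1 = np.sum(x_dev * y_dev) / np.sum(x_dev ** 2)  # slope
b0 = y.mean() - b1 * X.mean()                    # intercept
print(b1, b0)

# Cross-check: a degree-1 polynomial fit returns (slope, intercept)
slope, intercept = np.polyfit(X, y, deg=1)
print(slope, intercept)
```

Both approaches should agree with the loop to within floating-point rounding.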


In [5]:
import plotly.graph_objs as go
import plotly.offline as pyo

# Create the trace
trace = go.Scatter(x=X, y=y_pred_m, mode='markers')

# Create the layout
layout = go.Layout(title='Plot of X and predicted y (mathematical formula)', xaxis=dict(title='X'), yaxis=dict(title='Predicted y'))

# Combine the trace and layout
fig = go.Figure(data=[trace], layout=layout)

# Plot the graph
pyo.iplot(fig)

<span style="font-size:25px;"> <font color='blue'><b> Implementation through Gradient Descent <hr></b></font> </span>



In [6]:
## Implementation using Gradient Descent
## Setting up the initial parameters

weights = 0
bias = 0
alpha = 0.0001 ## Learning Rate
epochs = 500000 ## Number of gradient descent iterations
## We will fit the line y = weights*X + bias

def prediction(weights,bias,X):
    return weights*X + bias

def gradient(X,y,y_pred,n):
    ## The error being minimized is the MSE: (1/n) * sum((y - y_pred) ** 2)
    dw = -(2/n) * sum((y - y_pred)*X) ## Partial derivative of MSE with respect to w, using y_pred = weights(w) * X + bias(b)
    db = -(2/n) * sum(y - y_pred) ## Partial derivative of MSE with respect to b, using y_pred = weights(w) * X + bias(b)
    return dw,db


for i in range(epochs):
    y_pred = prediction(weights,bias,X)
    dw,db = gradient(X,y,y_pred,n)
    weights = weights - alpha*dw ## Gradient Descent
    bias = bias - alpha*db ## Gradient Descent
print(weights,bias)
y_pred = X*weights + bias

10.000000000000703 4.99999999995343
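
For reference, the gradients used in the gradient descent cell follow from the MSE by the chain rule. The short derivation below writes $w$ for `weights`, $b$ for `bias`, $\alpha$ for `alpha`, and $E$ for the MSE; these symbols are chosen here for brevity.

$$E(w, b) = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2, \qquad \hat{y}_i = wX_i + b$$

$$\frac{\partial E}{\partial w} = -\frac{2}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)X_i, \qquad \frac{\partial E}{\partial b} = -\frac{2}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)$$

Each epoch then moves both parameters a small step against the gradient:

$$w \leftarrow w - \alpha\frac{\partial E}{\partial w}, \qquad b \leftarrow b - \alpha\frac{\partial E}{\partial b}$$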


In [7]:
import plotly.graph_objs as go
import plotly.offline as pyo

# Create the trace
trace = go.Scatter(x=X, y=y_pred, mode='markers')

# Create the layout
layout = go.Layout(title='Plot of X and predicted y (gradient descent)', xaxis=dict(title='X'), yaxis=dict(title='Predicted y'))

# Combine the trace and layout
fig = go.Figure(data=[trace], layout=layout)

# Plot the graph
pyo.iplot(fig)

<span style="font-size:25px;"> <font color='blue'><b> Comparing the three results <hr></b></font> </span>


You have seen that the mathematical formula gave the exact result, while gradient descent gave an approximately correct result. Sometimes we have no option other than gradient descent; you will see such cases later in the course.
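
To put a number on "exact" versus "approximately correct", the sketch below computes the RMSE of both fitted lines against the true points. It reuses the two parameter pairs printed earlier in this notebook; the helper `rmse` and the dictionary `fits` are names assumed for this illustration.

```python
import numpy as np

# Same data as in the notebook: points on the line y = 10x + 5
X = np.arange(100)
y = 10 * X + 5

# (slope, intercept) pairs as printed by the two methods earlier
fits = {
    "mathematical formula": (10.0, 5.0),
    "gradient descent": (10.000000000000703, 4.99999999995343),
}


def rmse(y_true, y_hat):
    # Root mean square error between real and predicted values
    return np.sqrt(np.mean((y_true - y_hat) ** 2))


errors = {name: rmse(y, w * X + b) for name, (w, b) in fits.items()}
for name, err in errors.items():
    print(f"{name}: RMSE = {err:.3e}")
```

The formula's error is exactly zero on this data, while the gradient descent error is tiny but nonzero, which is why the lines are indistinguishable in the plot.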

In [8]:
# As you would expect, all three lines coincide
# Create a trace for each of the lines
trace1 = go.Scatter(x=X, y=y, mode='lines', name='Actual')
trace2 = go.Scatter(x=X, y=y_pred_m, mode='lines', name='Mathematical formula')
trace3 = go.Scatter(x=X, y=y_pred, mode='lines', name='Gradient descent')

# Combine the traces and create the layout
data = [trace1, trace2, trace3]
layout = go.Layout(title='Plot of Three Lines', xaxis=dict(title='X'), yaxis=dict(title='Y'))

# Create a figure and plot the graph
fig = go.Figure(data=data, layout=layout)
pyo.iplot(fig)
