![@mikegchambers](../../images/header.png)

# Linear Regression / Stochastic Gradient Descent

In this notebook, we explore Linear Regression using scikit-learn.

![Linear](linear.png)

In [None]:
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import SGDRegressor

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use('ggplot')

In [None]:
from sklearn import datasets

# The Data

In [None]:
X, y = datasets.make_regression(n_samples=100, n_features=1, noise=5, bias=0)
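To know roughly what slope the later sections should recover, it can help to ask make_regression for the coefficient it used to generate the data. The sketch below is an illustration rather than part of the original notebook: the `_demo` names and the `random_state=0` seed are assumptions added here so the output is repeatable.

```python
import numpy as np
from sklearn import datasets

# Same settings as the cell above, plus coef=True to get the underlying
# slope back and a fixed seed (an assumption, added for repeatability).
X_demo, y_demo, coef_demo = datasets.make_regression(
    n_samples=100, n_features=1, noise=5, bias=0, coef=True, random_state=0
)

# With a single feature the coefficient may come back as a 0-d or
# 1-element array, so flatten it before converting to a plain float.
true_slope = float(np.ravel(coef_demo)[0])

print(X_demo.shape)  # one column of features
print(y_demo.shape)  # one target value per sample
print(true_slope)    # the slope the noisy data was generated from
```

Because of the noise, any fitted slope should land near this value rather than exactly on it.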

In [None]:
axes = plt.axes()

axes.scatter(x=X, y=y, c='green')
    
plt.show()

# The brute force 'by-hand' way

Let's use our own code to find a model using the 'calculate all the values' brute force method.  It's slow, but it will get there-ish.

First we create a list of all the gradients we want to try out.

In [None]:
grad_test = np.arange(0, 150, 10)
print(grad_test)

Now, let's plot them on a graph so we can see what they all look like.

In [None]:
axes = plt.axes()

X_test = np.linspace(X.min(),X.max())

axes.scatter(x=X, y=y, c='green', zorder=1000)

for grad in grad_test:
    grad_y = grad*X_test
    axes.plot(X_test, grad_y)
    
axes.set_xlim(X.min(),X.max())
axes.set_ylim(y.min(),y.max())

plt.title('All the grads to test.')

plt.show()

Let's define our own, super simple loss function.  It only takes a gradient, so every line we test passes through the origin:

In [None]:
def simple_ssr_loss(grad, X, y):
    
    # Create a var to keep track of Sum of Square Residuals
    ssr = 0
    
    # Combine X and y data together, makes it easier to loop through
    data = np.stack((X[:,0],y), axis=1)
    
    # Loop through all the data and calculate the distance between the 
    # data point and the line at that point on the X axis.
    for d in data:
        line_point = d[0]*grad
        diff = line_point - d[1]
        
        # And add this to the Sum of Square Residuals
        ssr = ssr + (diff * diff)
        
    # All done.  Send the result back.
    return ssr
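The loop above spells out each step. As a rough sketch of the same calculation, NumPy can compute every residual at once. The function name `vectorized_ssr_loss` and the tiny data set are made up for illustration and are not part of the notebook.

```python
import numpy as np

def vectorized_ssr_loss(grad, X, y):
    # Predicted y for every point at once (a line through the origin).
    predictions = grad * X[:, 0]
    # Residual = distance between the line and each data point.
    residuals = predictions - y
    # Sum of squared residuals, returned as a plain Python float.
    return float(np.sum(residuals ** 2))

# Tiny hand-checkable data set (illustrative values only).
X_small = np.array([[1.0], [2.0], [3.0]])
y_small = np.array([2.0, 4.0, 6.0])

print(vectorized_ssr_loss(2, X_small, y_small))  # perfect fit: 0.0
print(vectorized_ssr_loss(1, X_small, y_small))  # 1 + 4 + 9 = 14.0
```

It should return the same value as simple_ssr_loss for the same inputs, just without the explicit Python loop.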

For each of our test gradients, let's calculate a loss.

First let's make some places to store what we find:

In [None]:
square_line = []
min_ssr = 1e100
min_grad = 0

Now let's perform the brute force bit:

In [None]:
for grad in grad_test:

    ssr = simple_ssr_loss(grad, X, y)

    square_line.append(ssr)
    if ssr < min_ssr:
        min_ssr = ssr
        min_grad = grad

Now we get to plot the results.  We should get a nice curve and, as we have them, we can also plot some lines to show the minimum; that's the bit we were looking for.

In [None]:
axes = plt.axes()

plt.plot(grad_test,square_line)

plt.axvline(min_grad, c='green')
plt.axhline(min_ssr, c='green')

plt.ylabel('SSR')
plt.xlabel('Gradient')

plt.show()

And to be clear, let's print the value of the gradient we found to fit best:

In [None]:
min_grad
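The grid only contains multiples of 10, so as long as the best slope lies inside the grid's range, the value printed above can be up to 5 away from it. For a line through the origin the loss is SSR(m) = Σ(m·xᵢ − yᵢ)², and setting its derivative to zero gives m = Σxᵢyᵢ / Σxᵢ². The sketch below checks that formula against a small grid search; the helper name `best_slope_through_origin` and the data values are assumptions made up for this example.

```python
import numpy as np

def best_slope_through_origin(X, y):
    # Closed-form minimiser of sum((m * x - y) ** 2) for a line with no intercept.
    x = X[:, 0]
    return float(np.sum(x * y) / np.sum(x * x))

# Small illustrative data set, roughly y = 2x with a little noise.
X_small = np.array([[1.0], [2.0], [3.0]])
y_small = np.array([2.1, 3.9, 6.2])

exact = best_slope_through_origin(X_small, y_small)

# The same search done by brute force over a coarse grid of whole numbers.
grid = np.arange(0, 5, 1)
grid_ssr = [np.sum((g * X_small[:, 0] - y_small) ** 2) for g in grid]
coarse = grid[int(np.argmin(grid_ssr))]

print(exact)   # slope from the formula
print(coarse)  # nearest grid value found by brute force
```

The grid search lands on the grid point nearest the exact answer, which is why a finer grid (or the formula) gives a better fit.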

Finally, for this brute force attempt, let's show the line we found, with the original data.  How does it look?

In [None]:
axes = plt.axes()

# Plot the data points:
axes.scatter(x=X, y=y, c="green")

# Use the best gradient we found (this model has no intercept):
grad = min_grad

# Plot the line (remember y=mx+c?  Here c is fixed at 0):
x_line = np.linspace(X.min(),X.max())
y_line = grad*x_line
axes.plot(x_line, y_line)

plt.title('Model Line by Brute Force')

plt.show()

# The scikit-learn way

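OK, now let's do it a far more efficient way, using scikit-learn.  LinearRegression solves the least-squares problem directly, while SGDRegressor (commented out below) approaches the answer iteratively using Stochastic Gradient Descent.  Either way there are far fewer lines for us to write, it is faster to run, and it is usually more accurate too.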

In [None]:
model = LinearRegression()
# model = SGDRegressor()

In [None]:
model.fit(X,y)
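The commented-out SGDRegressor line can be swapped in for LinearRegression without changing the rest of the code. As a rough sketch of how the two compare, the example below fits both on freshly generated data. The `_demo` names, the fixed `random_state=0` seeds, and the explicit `max_iter` and `tol` values are assumptions chosen for this illustration, not settings from the notebook.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression, SGDRegressor

# Fresh data with the same settings as the notebook, seeded for repeatability.
X_demo, y_demo = datasets.make_regression(
    n_samples=100, n_features=1, noise=5, bias=0, random_state=0
)

# Ordinary least squares: solved directly.
ols = LinearRegression().fit(X_demo, y_demo)

# Stochastic Gradient Descent: many small iterative updates.
sgd = SGDRegressor(max_iter=1000, tol=1e-3, random_state=0).fit(X_demo, y_demo)

ols_slope = float(ols.coef_[0])
sgd_slope = float(sgd.coef_[0])

print(ols_slope)
print(sgd_slope)
```

The two slopes should be close but not identical, since SGD stops once its improvement falls below the tolerance and applies a small regularisation penalty by default. SGD is also sensitive to feature scaling; the generated features here are already on a sensible scale.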

Done!  Now let's take a look:

In [None]:
axes = plt.axes()

# Plot the data points:
axes.scatter(x=X, y=y)

# axes.set_xlim(X.min(),X.max())
# axes.set_ylim(y.min(),y.max())

# Get the slope and the y intercept of the model line:
slope = model.coef_[0]
intercept = model.intercept_

# Plot the line (remember y=mx+c?):
x_line = np.linspace(X.min(),X.max())
y_line = slope*x_line+intercept
axes.plot(x_line, y_line, 'red')

plt.title('Model Line by scikit-learn')


plt.show()

And to be clear, let's print the value of the gradient scikit-learn found to fit best.  How different is it from our manual attempt?  

In [None]:
slope
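To put a number on that question, the self-contained sketch below repeats both approaches on one seeded data set and prints the gap between them. The variable names and the `random_state=0` seed are assumptions for this example; the notebook's own data is unseeded, so its numbers will differ from run to run.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression

# Seeded data with the same settings as the notebook.
X_demo, y_demo = datasets.make_regression(
    n_samples=100, n_features=1, noise=5, bias=0, random_state=0
)

# Brute force over the same coarse grid of gradients used earlier.
grid = np.arange(0, 150, 10)
grid_ssr = [np.sum((g * X_demo[:, 0] - y_demo) ** 2) for g in grid]
brute_slope = float(grid[int(np.argmin(grid_ssr))])

# scikit-learn's least-squares fit (which also estimates an intercept).
fitted = LinearRegression().fit(X_demo, y_demo)
fitted_slope = float(fitted.coef_[0])

gap = abs(brute_slope - fitted_slope)
print(brute_slope, fitted_slope, gap)
```

With a grid step of 10, the brute force answer is limited to the nearest multiple of 10, so a gap of up to roughly 5 is expected when the underlying slope lies inside the grid's range.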