# Multiple Linear Regression without Scikit-Learn

There's no better way to understand the gradient descent algorithm than to code it from scratch. What, you have heard this before? This time we are switching to gradient descent for multiple linear regression!

Don't hesitate to go back to your Machine Learning course on linear regression to refresh your memory.

Our goal is to code a multiple linear regression model of the form:

$f(x) = \beta \times x + \beta_0 = \beta_1 \times x_1 + \dots + \beta_p \times x_p + \beta_0$

* Import the following libraries: 
  * Numpy 

In [11]:
import numpy as np 

* Define a `Model` class with two methods: 
  1. `__init__(self, data)`, where `data` is the dataset containing the training variables. This is the class constructor, which defines an attribute $\beta_0$ (`beta_0` in your code) and an attribute $\beta$ (`beta` in your code). These attributes are the coefficients/parameters of the model, and we will initialize them randomly using Numpy (cf. `np.random.randn`).
     `beta` must contain as many random values as there are training variables.
  2. `__call__(self, x)`, a special method that makes instances of the class callable, so that calling one returns $\beta \times x + \beta_0$. 
  
  ⚠️ We are now working with matrices and vectors, so you will need to use operations that work on these objects. ⚠️

In [12]:
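class Model():
  def __init__(self, data):
    np.random.seed(42)  # fixed seed so the random initialization is reproducible
    feature_num = data.shape[1]  # number of training variables (columns)
    self.beta = np.random.randn(feature_num)  # one coefficient per variable
    self.beta_0 = np.random.randn(1)  # intercept
  
  def __call__(self, x):
    # Works for a single row (1-D) or for the whole dataset (2-D)
    return self.beta @ x.transpose() + self.beta_0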
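To see why the matrix product in `__call__` handles both a single observation and a whole dataset, here is a minimal sketch on made-up data. The array `X` and the names `single` and `batch` are assumptions introduced for this illustration only; they are not part of the exercise.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 3 features (not the diabetes dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
beta = rng.normal(size=3)    # one coefficient per feature
beta_0 = rng.normal(size=1)  # intercept, shape (1,)

# A single row has shape (3,): the matrix product reduces to a dot product.
single = beta @ X[0, :].transpose() + beta_0
# The full matrix has shape (4, 3): we get one prediction per row.
batch = beta @ X.transpose() + beta_0

print(single.shape)  # (1,)
print(batch.shape)   # (4,)
```

The first entry of `batch` equals the value in `single`, since both are computed from the same row.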

* Import `sklearn.datasets`
  * Use the `load_diabetes()` function to load the diabetes dataset in an object called `diabetes`.
  * Print the `DESCR` attribute of the diabetes object
  * Save the content of the `data` attribute in an object named `diabetes_data`
  * Save the content of the `target` attribute in an object named `y`

In [13]:
from sklearn import datasets

# Load the diabetes dataset
diabetes = datasets.load_diabetes()
print(diabetes.DESCR)
diabetes_data = diabetes.data
y = diabetes.target

.. _diabetes_dataset:

Diabetes dataset
----------------

Ten baseline variables, age, sex, body mass index, average blood
pressure, and six blood serum measurements were obtained for each of n =
442 diabetes patients, as well as the response of interest, a
quantitative measure of disease progression one year after baseline.

**Data Set Characteristics:**

  :Number of Instances: 442

  :Number of Attributes: First 10 columns are numeric predictive values

  :Target: Column 11 is a quantitative measure of disease progression one year after baseline

  :Attribute Information:
      - age     age in years
      - sex
      - bmi     body mass index
      - bp      average blood pressure
      - s1      tc, total serum cholesterol
      - s2      ldl, low-density lipoproteins
      - s3      hdl, high-density lipoproteins
      - s4      tch, total cholesterol / HDL
      - s5      ltg, possibly log of serum triglycerides level
      - s6      glu, blood sugar level

Note: Each of these 1

* Create an instance of your class `Model` and display `beta_0` and `beta`

In [14]:
model = Model(diabetes_data)

In [15]:
model.beta_0

array([-0.46341769])

In [16]:
model.beta

array([ 0.49671415, -0.1382643 ,  0.64768854,  1.52302986, -0.23415337,
       -0.23413696,  1.57921282,  0.76743473, -0.46947439,  0.54256004])

* Try doing a first "regression" by running `model(diabetes_data[0,:])`. 
NB: If you don't get the same values as this notebook, that is normal unless you use the same random seed, since the parameters are initialized randomly. 

In [17]:
model(diabetes_data[0,:])

array([-0.44918067])

In [18]:
model(diabetes_data)

array([-0.44918067, -0.45589502, -0.45771637, -0.61984002, -0.44881756,
       -0.550924  , -0.55308199, -0.31717212, -0.50309643, -0.58730845,
       -0.57130368, -0.54082745, -0.44635102, -0.48387345, -0.40174933,
       -0.45171479, -0.27968973, -0.41108348, -0.57390173, -0.53677669,
       -0.57906058, -0.49211392, -0.49011248, -0.37743941, -0.51895877,
       -0.66365533, -0.55673145, -0.45351227, -0.61043403, -0.28764662,
       -0.47033211, -0.58379631, -0.37631679, -0.30774322, -0.57850968,
       -0.38171861, -0.36152731, -0.51942106, -0.31431692, -0.60903986,
       -0.43776635, -0.66967106, -0.60605254, -0.38472769, -0.47960819,
       -0.46721094, -0.58847992, -0.70037719, -0.39755505, -0.5734833 ,
       -0.42387308, -0.42767024, -0.47684303, -0.30892654, -0.38646623,
       -0.41907887, -0.55687358, -0.52939462, -0.21300692, -0.42595264,
       -0.49266141, -0.63212634, -0.45536384, -0.65455112, -0.50557262,
       -0.46546158, -0.58509762, -0.5022422 , -0.52900267, -0.47

* These values correspond to random predictions of your model, since its parameters have not been fitted to the data yet. 

* Visualize `y` against the predictions using `plotly`.

In [52]:
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(x=y, y=model(diabetes_data),
                    mode='markers',
                    name = "target vs predictions"))
fig.add_trace(go.Scatter(x=y,y=y,
              mode="lines",
              name = "perfect prediction line"))
fig.update_layout(
    title="Target vs Predictions",
    xaxis_title="target",
    yaxis_title="predictions"
    )
fig.show()

* Now we need to define a cost function. For a linear regression, we can use the MSE: 

`np.mean((model(input) - y)**2)`

  * Create a function called `mse` (for mean squared error). This function takes two arguments, `y_pred` & `y_true`.

In [53]:
def mse(y_pred, y_true):
  # Alternate solution: return np.sum((y_pred - y_true)**2) / len(y_pred)
  # It is important to use numpy functions: they apply directly to numpy
  # arrays (or array-like objects) and are much faster than pure Python operations.
  return np.mean((y_pred - y_true)**2)
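As a quick sanity check of the formula, here is a small worked example on hand-picked values. The arrays below are hypothetical and chosen so the result can be verified by hand.

```python
import numpy as np

def mse(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

# Hypothetical values: the errors are 0, 2 and -2.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 4.0, 1.0])

# Squared errors are 0, 4 and 4, so the mean is 8/3.
error = mse(y_pred, y_true)
rmse = np.sqrt(error)  # back in the units of the target

print(error)
print(rmse)
```

The RMSE is often easier to interpret than the MSE because it is expressed in the same units as `y`.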

* Test your function by inserting `model(diabetes_data)` & `y` as arguments. 
* Calculate the RMSE as well.

In [54]:
print("MSE : ",mse(model(diabetes_data), y))
print("RMSE : ",np.sqrt(mse(model(diabetes_data), y)))

MSE :  28772.791166881092
RMSE :  169.62544374851637


* We need to compute the gradients of the cost function with respect to our coefficients `model.beta` and our intercept `model.beta_0`. To do this, we need the corresponding derivative formulas. Since we're not here to do math, we give them to you: 
  * `derive_model_beta = 2/len(y_pred)*(x.transpose() @ (y_pred - y_true))`
  * `derive_model_beta_0 = 2/len(y_pred)*(np.sum(y_pred - y_true))`

  * Feel free to read this article if you want to know more about how these derivatives are calculated: [Gradient Descent Derivation](https://mccormickml.com/2014/03/04/gradient-descent-derivation/)


  * Using the above formulas, code a first function `derivative_mse_beta` that takes the arguments: 
    * `x` --> the values of your variables / `y_pred` --> the values predicted by your model / `y_true` --> the values of the target variable
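For reference, here is a short sketch of where the two formulas come from. The notation is an assumption introduced here: $n$ is the number of observations (`len(y_pred)`), $X$ is the data matrix `x`, $x_{ij}$ is the value of variable $j$ for observation $i$, and $\hat{y}$ is `y_pred`.

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2, \qquad \hat{y}_i = \sum_{j=1}^{p}\beta_j \times x_{ij} + \beta_0$$

$$\frac{\partial\,\text{MSE}}{\partial \beta_j} = \frac{2}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i) \times x_{ij} \quad\Longrightarrow\quad \nabla_{\beta}\,\text{MSE} = \frac{2}{n}\,X^{\top}(\hat{y} - y)$$

$$\frac{\partial\,\text{MSE}}{\partial \beta_0} = \frac{2}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)$$

Both results follow from the chain rule: the derivative of the square gives the factor 2, and the derivative of $\hat{y}_i$ is $x_{ij}$ with respect to $\beta_j$ and 1 with respect to $\beta_0$.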


In [55]:
# Gradient of the MSE with respect to model.beta
def derivative_mse_beta(y_pred, y_true, x):
  return 2/len(y_pred)*(x.transpose() @ (y_pred - y_true))

* Test your function

In [56]:
derivative_mse_beta(model(diabetes_data), y, diabetes_data)

array([-1.37696342, -0.30868695, -4.30771757, -3.24242072, -1.55942054,
       -1.27887065,  2.89074699, -3.1552449 , -4.15188071, -2.80993586])

* So using the above formulas, now code the `derivative_mse_beta_0` function which will take the arguments :
    * `y_pred` --> the values predicted by your model / `y_true` --> the actual values to predict

In [57]:
# Gradient of the MSE with respect to model.beta_0
def derivative_mse_beta_0(y_pred, y_true):
  return 2/len(y_pred)*(np.sum(y_pred - y_true))

* Test your function

In [58]:
derivative_mse_beta_0(model(diabetes_data), y)

-302.19045426675444
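One way to gain confidence in the two gradient functions is to compare them with a finite-difference approximation. The sketch below does this on hypothetical synthetic data; the helper `loss`, the step `eps`, and the arrays `X`, `y`, `beta` are assumptions made for this check only.

```python
import numpy as np

def mse(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

def derivative_mse_beta(y_pred, y_true, x):
    return 2 / len(y_pred) * (x.transpose() @ (y_pred - y_true))

def derivative_mse_beta_0(y_pred, y_true):
    return 2 / len(y_pred) * np.sum(y_pred - y_true)

# Hypothetical synthetic data, used only for this check.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
beta = rng.normal(size=3)
beta_0 = 0.5

def loss(b, b0):
    # MSE of the linear model with coefficients b and intercept b0
    return mse(X @ b + b0, y)

eps = 1e-6
# Central differences: perturb one coefficient at a time.
numeric_beta = np.array([
    (loss(beta + eps * e, beta_0) - loss(beta - eps * e, beta_0)) / (2 * eps)
    for e in np.eye(3)
])
numeric_beta_0 = (loss(beta, beta_0 + eps) - loss(beta, beta_0 - eps)) / (2 * eps)

# Analytic gradients from the formulas given above.
y_pred = X @ beta + beta_0
analytic_beta = derivative_mse_beta(y_pred, y, X)
analytic_beta_0 = derivative_mse_beta_0(y_pred, y)

print(np.allclose(numeric_beta, analytic_beta, atol=1e-5))
print(np.isclose(numeric_beta_0, analytic_beta_0, atol=1e-5))
```

If either comparison fails, the most likely culprit is a shape or sign error in the analytic formula.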

* We will try to see if we can minimize our cost function using the two gradients above. To update each parameter, we subtract its gradient multiplied by a learning rate. Ex: 
  * `param = param - learning_rate * gradient`

  * Set a `learning_rate` to 0.1
  * Try to apply your formula on `model.beta` and `model.beta_0`.
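The update rule can be illustrated on a one-parameter problem before applying it to the model. The function `f`, its gradient `grad_f`, and the starting point below are hypothetical and chosen only because the minimum (at `w = 3`) is known in advance.

```python
# Hypothetical one-parameter cost function with its minimum at w = 3.
def f(w):
    return (w - 3.0) ** 2

def grad_f(w):
    return 2.0 * (w - 3.0)

learning_rate = 0.1
w = 0.0
history = [f(w)]  # cost before any update

for _ in range(50):
    # param = param - learning_rate * gradient
    w = w - learning_rate * grad_f(w)
    history.append(f(w))

print(w)            # close to 3
print(history[:3])  # the cost shrinks at every step
```

With this learning rate, each step moves `w` a fixed fraction of the remaining distance toward the minimum, which is why the cost decreases steadily.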

In [59]:
lr = 0.1

print("OLD model.beta = {}".format(model.beta))
print("OLD model.beta_0 = {}".format(model.beta_0))

model.beta -= lr * derivative_mse_beta(model(diabetes_data), y, diabetes_data)
model.beta_0 -= lr * derivative_mse_beta_0(model(diabetes_data), y)

print("NEW model.beta = {}".format(model.beta))
print("NEW model.beta_0 = {}".format(model.beta_0))

OLD model.beta = [ 0.78232202  1.75352754 -2.11562934 -1.24564542 -1.09323446  0.28803622
 -0.20466382  0.77552423 -0.01479812 -1.16570008]
OLD model.beta_0 = [1.03825703]
NEW model.beta = [ 0.92001836  1.78439623 -1.68485758 -0.92140335 -0.93729241  0.41592328
 -0.49373852  1.09104872  0.40038995 -0.8847065 ]
NEW model.beta_0 = [31.25730246]


We see that the values of the two parameters have changed. Let's see how this affected the predictions of the model. 
Visualize `y` against the model's predictions.

In [60]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=y, y=model(diabetes_data),
                    mode='markers',
                    name = "target vs predictions"))
fig.add_trace(go.Scatter(x=y,y=y,
              mode="lines",
              name = "perfect prediction line"))
fig.update_layout(
    title="Target vs Predictions",
    xaxis_title="target",
    yaxis_title="predictions"
    )
fig.show()

We notice that the predictions got a little closer to our real data.
* Recalculate your MSE

In [61]:
mse(model(diabetes_data), y)

20546.223770820954

* Our MSE has dropped a lot! This is good news, but gradient descent is an iterative process, so you'll have to repeat the update many times before arriving at accurate predictions. 
  * Using a loop, repeat the process above 10,000 times. 
  * Every 100 epochs, display the MSE, `model.beta` & `model.beta_0`. 

In [62]:
# Define learning rate and a number of iterations 
lr = 0.1
epochs = 10000

In [63]:
model = Model(diabetes_data)
for epoch in range(epochs):
  # Calculate the loss function
  current_loss = mse(model(diabetes_data), y)
  
  # Update variables
  model.beta -= lr * derivative_mse_beta(model(diabetes_data), y, diabetes_data)
  model.beta_0 -= lr * derivative_mse_beta_0(model(diabetes_data), y)

  # Show updated variables
  if epoch % 100 == 0 or epoch == epochs - 1:
    print("-------------------- Epoch {} --------------------".format(epoch))
    print("Current Loss: {}".format(current_loss))
    print("beta_1 = {}".format(model.beta))
    print("beta_0 = {}".format(model.beta_0))

-------------------- Epoch 0 --------------------
Current Loss: 29360.389657030777
beta_1 = [ 1.23987661 -0.33415013  0.60670116  2.33147567  0.82901638 -0.77647656
  0.27622141  0.20704351 -1.53888191  1.83686001]
beta_0 = [29.6699059]
-------------------- Epoch 100 --------------------
Current Loss: 5257.978755978736
beta_1 = [ 13.46785382   1.68991292  40.96657338  32.38790684  14.31710459
   9.94421097 -26.48083564  28.82180142  37.04490011  27.38954017]
beta_0 = [152.13348414]
-------------------- Epoch 200 --------------------
Current Loss: 4769.976885110974
beta_1 = [ 23.1386294    1.7676583   76.73465809  58.55601882  24.2824642
  17.16490294 -49.34479863  52.27665883  70.52577468  48.71637592]
beta_0 = [152.13348416]
-------------------- Epoch 300 --------------------
Current Loss: 4409.924647434647
beta_1 = [ 30.69635385   0.27883898 108.57795174  81.42610926  31.38865522
  21.56340157 -68.92607255  71.45049815  99.70388209  66.49972816]
beta_0 = [152.13348416]
--------------
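The loop above recomputes the predictions after `model.beta` has already been updated, so the two gradients are not evaluated at the same point. A common variant computes both gradients from one set of residuals before touching either parameter. Here is a minimal sketch of that variant on hypothetical synthetic data, with NumPy's closed-form least-squares solver used as a reference. The names `true_beta`, `true_beta_0`, `X_ones`, and `reference`, and the choice of 1000 epochs, are assumptions for this illustration.

```python
import numpy as np

# Hypothetical synthetic problem with known coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_beta = np.array([2.0, -1.0, 0.5])
true_beta_0 = 4.0
y = X @ true_beta + true_beta_0 + 0.1 * rng.normal(size=200)

beta = np.zeros(3)
beta_0 = 0.0
lr = 0.1

for epoch in range(1000):
    residual = X @ beta + beta_0 - y
    # Both gradients come from the same residual (simultaneous update).
    grad_beta = 2 / len(y) * (X.transpose() @ residual)
    grad_beta_0 = 2 / len(y) * np.sum(residual)
    beta -= lr * grad_beta
    beta_0 -= lr * grad_beta_0

# Reference solution: least squares with a column of ones for the intercept.
X_ones = np.column_stack([X, np.ones(len(y))])
reference, *_ = np.linalg.lstsq(X_ones, y, rcond=None)

print(beta, beta_0)
print(reference)  # coefficients first, intercept last
```

On this well-scaled toy problem the two solutions agree closely. On data whose variables have a very small spread, gradient descent may need many more epochs to get near the least-squares solution.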

* Using `plotly`, view your model and actual values again

In [64]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=y, y=model(diabetes_data),
                    mode='markers',
                    name = "target vs predictions"))
fig.add_trace(go.Scatter(x=y,y=y,
              mode="lines",
              name = "perfect prediction line"))
fig.update_layout(
    title="Target vs Predictions",
    xaxis_title="target",
    yaxis_title="predictions"
    )
fig.show()

**We've got a nice regression this time!** 