# Linear Regression

One of the most widely used models for regression is known as linear regression. This **asserts
that the response is a linear function of the inputs**. This can be written as follows:
![](img/f1.png)
where $\mathbf{w}^{T}\mathbf{x}$ represents the inner or **scalar product** between the input vector $\mathbf{x}$ and the model's weight vector $\mathbf{w}$, and $\epsilon$ is the **residual error** between our linear predictions and the true response.
We often assume that $\epsilon$ has a Gaussian or normal distribution. We denote this by $\epsilon\sim\mathcal{N}(\mu,\sigma^2)$, where $\mu$ is the mean and $\sigma^2$ is the variance. When we plot this distribution, we get the well-known bell curve (shown here is the Gaussian pdf with mean 0 and variance 1):
![A Gaussian pdf with mean 0 and variance 1](img/f2.png)
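
To make the bell curve concrete, here is a minimal sketch that evaluates the Gaussian density numerically. The helper name `gaussian_pdf` and the grid of points are illustrative choices, not part of the original notebook.

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2) evaluated at x (scalar or array)."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

# Evaluate the standard normal (mean 0, variance 1) on a grid
xs = np.linspace(-4.0, 4.0, 801)
pdf = gaussian_pdf(xs)

# The peak sits at x = mu and equals 1 / sqrt(2 * pi * sigma2)
peak = pdf.max()

# A simple Riemann sum shows that almost all of the mass lies within [-4, 4]
area = np.sum(pdf) * (xs[1] - xs[0])

print(peak, area)
```

Passing `pdf` to `plt.plot(xs, pdf)` should reproduce a curve like the one in the figure.
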
To make the connection between linear regression and Gaussians more explicit, we can rewrite the model in the following form:
![](img/f3.png)
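
The link between the Gaussian noise assumption and least squares can be checked numerically. The sketch below assumes the usual conditional form $y \sim \mathcal{N}(\mathbf{w}^{T}\mathbf{x}, \sigma^2)$ with a constant noise variance. The weights `w_true`, the noise level `sigma`, and the sample size are hypothetical values picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Hypothetical ground truth, chosen only for this illustration
w_true = np.array([2.0, -1.0])
sigma = 0.5

# Draw inputs, then responses y ~ N(w^T x, sigma^2)
X_sim = rng.normal(size=(200, 2))
y_sim = rng.normal(loc=X_sim @ w_true, scale=sigma)

def neg_log_likelihood(w, X, y, sigma):
    """Negative Gaussian log-likelihood of the data for weights w."""
    resid = y - X @ w
    n = len(y)
    return 0.5 * n * np.log(2.0 * np.pi * sigma ** 2) + 0.5 * np.sum(resid ** 2) / sigma ** 2

# Least-squares estimate of the weights
w_ls, *_ = np.linalg.lstsq(X_sim, y_sim, rcond=None)

# For fixed sigma, the least-squares weights minimize the negative log-likelihood
print(neg_log_likelihood(w_ls, X_sim, y_sim, sigma))
print(neg_log_likelihood(w_true, X_sim, y_sim, sigma))
```

Because the only term that depends on the weights is the sum of squared residuals, maximizing this likelihood over the weights is the same as minimizing the squared error.
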

## Algorithm

In [27]:
# NumPy
import numpy as np

# Plotting
import time
from matplotlib import pyplot as plt
from IPython import display
%matplotlib inline

from abc import ABCMeta, abstractmethod

Let's implement the simplest version of the linear regression algorithm, based on the ordinary least squares method.

In [33]:
class LinearRegression:
    """Ordinary least squares without an intercept term."""

    def __init__(self):
        # Takes no arguments, for now
        self.X = None
        self.y = None
        self.coef_ = None
        self.intercept_ = None  # not estimated by this simple version

    def fit(self, X_train, y_train):
        self.X = np.array(X_train)
        self.y = np.array(y_train)
        # One step: w = pinv(X) @ y is the least-squares solution of X w = y
        self.coef_ = np.dot(np.linalg.pinv(self.X), self.y)
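
As a sanity check on the one-step solution, the sketch below compares the pseudo-inverse estimate with two other standard routes to the same least-squares weights: the normal equations and `np.linalg.lstsq`. The data (`X_demo`, `w_demo`, `y_demo`) is synthetic and made up purely for this comparison.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Hypothetical data: 50 samples, 3 features, known weights plus small noise
X_demo = rng.normal(size=(50, 3))
w_demo = np.array([1.5, -2.0, 0.5])
y_demo = X_demo @ w_demo + 0.1 * rng.normal(size=50)

# 1) Pseudo-inverse, the same computation as in the fit method above
w_pinv = np.linalg.pinv(X_demo) @ y_demo

# 2) Normal equations: solve (X^T X) w = X^T y
w_normal = np.linalg.solve(X_demo.T @ X_demo, X_demo.T @ y_demo)

# 3) NumPy's least-squares solver
w_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)

print(np.allclose(w_pinv, w_normal), np.allclose(w_pinv, w_lstsq))
```

When `X` has full column rank, all three agree up to floating-point error. The pseudo-inverse and `lstsq` routes also handle rank-deficient inputs, where the normal equations would be singular.
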

In [41]:
X = [[1, 1], [1, 2], [2, 1], [4, 4], [4.5, 4.5], [6.5, 1], [6.2, 2.3], [6, 2]]
y = [0, 0, 0, 1, 1, 2, 2, 2]

lr1 = LinearRegression()
lr1.fit(X, y)
print(lr1.coef_)

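# scikit-learn fits an intercept by default (fit_intercept=True),
# so its coefficients differ from the intercept-free fit above
import sklearn.linear_model
lr2 = sklearn.linear_model.LinearRegression()
lr2.fit(X, y)
print(lr2.coef_)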


[ 0.34734284 -0.12574883]
[ 0.39961271 -0.04130053]
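
The two coefficient vectors differ because the simple implementation has no intercept, while scikit-learn estimates one by default. One common way to close the gap, sketched here as an assumption rather than as the notebook's own method, is to prepend a column of ones to `X` so that the first weight plays the role of the intercept. The names `X_aug` and `w_aug` are illustrative.

```python
import numpy as np

# Same data as in the cell above
X = np.array([[1, 1], [1, 2], [2, 1], [4, 4], [4.5, 4.5], [6.5, 1], [6.2, 2.3], [6, 2]])
y = np.array([0, 0, 0, 1, 1, 2, 2, 2])

# Prepend a constant column so the first weight acts as the intercept
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
w_aug = np.linalg.pinv(X_aug) @ y

intercept, coef = w_aug[0], w_aug[1:]
print(coef)       # compare with the scikit-learn coefficients printed above
print(intercept)
```

With the constant column included, `coef` should agree with the second line of the output above up to floating-point error.
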


In [29]:
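# Pseudo-inverse matrix of a small invertible 2x2 example;
# for such a matrix the pseudo-inverse equals the ordinary inverse
print(np.linalg.pinv([[1, 2], [1, 3]]))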

[[ 3. -2.]
 [-1.  1.]]
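
A short sketch of two properties of the pseudo-inverse that matter here. The matrix `A` is chosen so that its inverse is the 2x2 matrix printed above. `X_tall` is the 8x2 training matrix from the earlier cell, whose pseudo-inverse has the transposed shape.

```python
import numpy as np

# Square, invertible case: the pseudo-inverse equals the ordinary inverse
A = np.array([[1.0, 2.0], [1.0, 3.0]])
A_pinv = np.linalg.pinv(A)
print(np.allclose(A_pinv, np.linalg.inv(A)))

# Tall case: 8 samples, 2 features (the training data used earlier)
X_tall = np.array([[1, 1], [1, 2], [2, 1], [4, 4], [4.5, 4.5], [6.5, 1], [6.2, 2.3], [6, 2]])
X_pinv = np.linalg.pinv(X_tall)
print(X_pinv.shape)  # transposed shape: (n_features, n_samples)

# With full column rank, pinv(X) is a left inverse of X
print(np.allclose(X_pinv @ X_tall, np.eye(2)))
```

For a tall matrix, the product in the other order, `X_tall @ X_pinv`, is not the identity. It is the projection onto the column space of `X_tall`, which is why `pinv(X) @ y` gives the least-squares fit.
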
