# Regression

## Linear Regression

Linear regression predicts the value $y$ by summing the products of corresponding weights $W$ and measures $X$,

$$y \equiv f(W,X) = w_0 + \sum_{i=1}^N w_i x_i$$

where $w_0$ is the $y$ intercept of $f$.
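The prediction rule can be sketched in a few lines of NumPy. The function name `predict` and the numeric values below are made up for illustration only.

```python
import numpy as np


def predict(w0, w, x):
    """Return w0 + sum_i w[i] * x[i] for a single sample x."""
    return w0 + np.dot(w, x)


w0 = 1.0                    # intercept (illustrative value)
w = np.array([2.0, -1.0])   # one weight per feature
x = np.array([3.0, 4.0])    # one sample with N = 2 features

y = predict(w0, w, x)       # 1 + 2*3 - 1*4
print(y)
```

Here the intercept shifts the prediction by a constant, while each weight scales the contribution of its own feature.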

### Ordinary Least Squares

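Ordinary least-squares regression chooses the weights that minimize the sum of squared differences between each observed value $y_j$ and the corresponding prediction $f(W, X_j)$, taken over all $M$ data points.

$$L = \sum_{j=1}^M \left(f(W, X_j) - y_j\right)^2$$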
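As a rough sketch, the sum of squared residuals can be written directly in NumPy and minimized with `np.linalg.lstsq`. The helper name `ols_loss` and the toy data are assumptions for illustration; prepending a column of ones is one common way to estimate the intercept alongside the weights.

```python
import numpy as np


def ols_loss(w0, w, X, y):
    """Sum of squared differences between predictions and observations."""
    residuals = w0 + X @ w - y
    return float(np.sum(residuals ** 2))


# Toy data that follow y = 1 + 2x exactly (illustrative values).
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([1.0, 3.0, 5.0])

# Prepend a column of ones so the intercept is estimated with the weights.
A = np.column_stack([np.ones(len(X)), X])
solution, *_ = np.linalg.lstsq(A, y, rcond=None)
w0_hat, w_hat = solution[0], solution[1:]

print(w0_hat, w_hat)                      # close to 1 and [2]
print(ols_loss(w0_hat, w_hat, X, y))      # close to 0
print(ols_loss(0.0, np.zeros(1), X, y))   # loss of the all-zero model
```

The all-zero model leaves every observation unexplained, so its loss is simply the sum of the squared targets.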

Let's try an example using the fit method of the scikit-learn LinearRegression class, which accepts arrays X and y and stores the fitted weights in its coef_ attribute.

In [1]:
from sklearn import linear_model
reg = linear_model.LinearRegression()

In [2]:
reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])

LinearRegression()

In [3]:
reg.coef_

array([0.5, 0.5])
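The two feature columns in this example are identical, so many weight pairs fit the data equally well. The sketch below shows one way to arrive at the same numbers with NumPy alone: center the data and take the minimum-norm least-squares solution. Whether scikit-learn follows exactly this route internally is not something this example establishes.

```python
import numpy as np

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
y = np.array([0.0, 1.0, 2.0])

# Center the data so the intercept can be recovered afterwards.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# With duplicated columns the system is rank deficient; lstsq returns
# the exact-fit solution with the smallest Euclidean norm.
w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
w0 = y.mean() - X.mean(axis=0) @ w

# Predict for a new sample using the fitted weights and intercept.
prediction = w0 + np.array([3.0, 3.0]) @ w
print(w, w0, prediction)
```

Any pair of weights that sums to 1 reproduces the targets here; the minimum-norm choice splits the slope evenly between the two columns.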

### Ridge Regression

Ridge regression penalizes coefficient size by adding the sum of the squared weights, scaled by a regularization strength $\alpha$, to the sum of squared errors over the $M$ data points.

$$L = \sum_{j=1}^M \left(f(W, X_j) - y_j\right)^2 + \alpha \sum_{i=1}^N w_i^2$$
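A minimal sketch of the penalized loss, assuming the usual form with a single strength `alpha` applied to the squared weights and no penalty on the intercept. The function name `ridge_loss` and the data are illustrative.

```python
import numpy as np


def ridge_loss(w0, w, X, y, alpha):
    """Squared-error loss plus alpha times the sum of squared weights."""
    residuals = w0 + X @ w - y
    return float(np.sum(residuals ** 2) + alpha * np.sum(w ** 2))


X = np.array([[1.0], [2.0]])
y = np.array([1.0, 2.0])
w = np.array([1.0])   # fits the data exactly, so the residuals are zero

print(ridge_loss(0.0, w, X, y, alpha=0.0))   # no penalty: plain least squares
print(ridge_loss(0.0, w, X, y, alpha=0.5))   # the penalty alone contributes
```

Because the penalty grows with the weights, the minimizer trades a slightly worse fit for smaller coefficients as `alpha` increases.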

The fit method of the scikit-learn Ridge class works like that of LinearRegression; the regularization strength is passed as the alpha argument.

In [4]:
reg = linear_model.Ridge(alpha=.5)

In [5]:
reg.fit([[0, 0], [0, 0], [1, 1]], [0, .1, 1])

Ridge(alpha=0.5)

In [6]:
reg.coef_

array([0.34545455, 0.34545455])

In [7]:
reg.intercept_

0.1363636363636364
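The coefficients and intercept above can be reproduced from the closed-form ridge solution. The sketch assumes the intercept is left unpenalized, which is handled here by centering X and y before solving; the variable names are illustrative.

```python
import numpy as np

X = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.1, 1.0])
alpha = 0.5

# Center the data so that the intercept is not penalized.
x_mean = X.mean(axis=0)
y_mean = y.mean()
Xc = X - x_mean
yc = y - y_mean

# Closed-form ridge weights: (Xc^T Xc + alpha * I)^{-1} Xc^T yc.
n_features = X.shape[1]
w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)

# Recover the intercept from the means.
w0 = y_mean - x_mean @ w

print(w)    # compare with reg.coef_
print(w0)   # compare with reg.intercept_
```

Adding `alpha` to the diagonal also makes the system solvable even though the two feature columns are identical.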

When ridge regression is used for classification with many classes, it can be significantly faster than logistic regression, because the projection matrix $(X^T X + \alpha I)^{-1} X^T$ is computed just once and reused for every class.
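The "computed once" idea can be sketched with NumPy. The example assumes one target column per class with entries of +1 or -1 and omits the intercept for brevity; the random data and names are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 20, 3, 4

X = rng.normal(size=(n_samples, n_features))
labels = rng.integers(0, n_classes, size=n_samples)

# One column per class: +1 where the sample belongs to it, -1 elsewhere.
Y = np.where(labels[:, None] == np.arange(n_classes), 1.0, -1.0)

alpha = 1.0
gram = X.T @ X + alpha * np.eye(n_features)

# The projection matrix depends only on X and alpha, not on the targets.
P = np.linalg.solve(gram, X.T)   # shape (n_features, n_samples)

# All class weight vectors come from a single matrix product.
W = P @ Y                        # shape (n_features, n_classes)
print(W.shape)
```

Each column of `W` matches what a separate ridge fit on that class's targets would give, but the linear solve is performed only once.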

## Polynomial Regression

Polynomial regression extends linear regression to a polynomial function $\hat{f}$ of $W$ and $X$.

For example, fitting a paraboloid would entail

$$\hat{y} \equiv \hat{f}(W, X) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_1 x_2 + w_4 x_1 ^2 + w_5 x_2 ^2$$

To extend linear regression, we define
$$z \equiv \{x_1, x_2, x_1 x_2, x_1 ^2, x_2 ^2\}$$

We can then restate our polynomial $\hat{f}$ as a function that is linear in $z$,
$$\hat{y} = f(W, z) = w_0 + w_1 z_1 + w_2 z_2 + w_3 z_3 + w_4 z_4 + w_5 z_5$$
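The change of variables can be sketched by building the $z$ columns by hand, taking the third component to be the cross term $x_1 x_2$. The helper name `quadratic_features` and the weights are illustrative.

```python
import numpy as np


def quadratic_features(X):
    """Map columns (x1, x2) to (x1, x2, x1*x2, x1^2, x2^2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1 * x2, x1 ** 2, x2 ** 2])


X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
Z = quadratic_features(X)

# Illustrative weights encoding y = 1 + x1 + 2 * x2^2.
w0 = 1.0
w = np.array([1.0, 0.0, 0.0, 0.0, 2.0])

# The model is an ordinary linear prediction in terms of Z.
y_hat = w0 + Z @ w
print(Z)
print(y_hat)
```

Once the columns of `Z` exist, any linear regression routine can fit the weights; the scikit-learn cells that follow generate such columns automatically.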

In [8]:
from sklearn.preprocessing import PolynomialFeatures
import numpy as np

In [9]:
X = np.arange(6).reshape(3, 2)

In [10]:
X

array([[0, 1],
       [2, 3],
       [4, 5]])

In [11]:
poly = PolynomialFeatures(degree=2)

In [12]:
poly.fit_transform(X)

array([[ 1.,  0.,  1.,  0.,  0.,  1.],
       [ 1.,  2.,  3.,  4.,  6.,  9.],
       [ 1.,  4.,  5., 16., 20., 25.]])
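In the output above, the columns appear in the order $1, x_1, x_2, x_1^2, x_1 x_2, x_2^2$. The sketch below feeds such features into LinearRegression to fit a known quadratic. The synthetic grid and target are assumptions for illustration, and `include_bias=False` drops the constant column because LinearRegression fits its own intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic samples on a 5 x 5 grid (illustrative data).
x1, x2 = np.meshgrid(np.arange(5.0), np.arange(5.0))
X = np.column_stack([x1.ravel(), x2.ravel()])

# Noise-free target: y = 1 + 2 * x1 + 3 * x2^2.
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] ** 2

# Columns after the transform: x1, x2, x1^2, x1*x2, x2^2.
poly = PolynomialFeatures(degree=2, include_bias=False)
Z = poly.fit_transform(X)

reg = LinearRegression().fit(Z, y)
print(reg.coef_)        # weights for x1, x2, x1^2, x1*x2, x2^2
print(reg.intercept_)
```

With noise-free data the fit should recover weights close to 2 for $x_1$ and 3 for $x_2^2$, an intercept close to 1, and values near zero elsewhere.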