In [1]:
import numpy as np

### Linear Regression

#### Univariate linear regression

$y_n=\beta_xx_n+\epsilon_n$

$\epsilon_n=y_n-\beta_xx_n$  -- the error $\epsilon_n$ is the difference between the actual measurement $y_n$ and the estimate $\beta_xx_n$

$E=\Sigma_n\epsilon_n^2$      -- total error; the size of each error is measured by its square because squares are easier to work with (positive and negative errors cannot cancel, and the sum is differentiable everywhere)



$E=\Sigma_n(y_n-\beta_xx_n)^2$ 

The value of $\beta_x$ that minimizes $E$ is the value for which the derivative of $E$ with respect to $\beta_x$ is 0.

$\frac{dE}{d\beta_x} = \Sigma_n (-2)(y_n-\beta_xx_n)x_n$

$0 = (-2)\Sigma_n (y_n-\beta_xx_n)x_n$

$0 = \Sigma_n (y_nx_n-\beta_xx_n^2)$


$\Sigma_n\beta_xx_n^2 = \Sigma_n y_nx_n$

$\beta_x = \frac{\Sigma_n y_nx_n}{\Sigma_nx_n^2}$
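
The closed-form result can be sanity-checked numerically. The sketch below is only an illustration: the data values are assumed (they match the cell that follows), and the helper names `total_error`, `dE_dbeta` and `beta_hat` are made up for this check.

```python
import numpy as np

# Illustrative data (assumed for this check)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.5, 1.1, 1.4, 2.1])

def total_error(b):
    """Sum of squared errors E for a given slope b."""
    return ((y - b * x) ** 2).sum()

def dE_dbeta(b):
    """Derivative of E with respect to the slope, as derived above."""
    return (-2 * (y - b * x) * x).sum()

# Closed-form minimizer from the derivation
beta_hat = (y * x).sum() / (x ** 2).sum()

# The derivative should be zero (up to rounding) at beta_hat
print(dE_dbeta(beta_hat))

# Nudging the slope in either direction should increase E
print(total_error(beta_hat - 0.01), total_error(beta_hat), total_error(beta_hat + 0.01))
```

Because $E$ is a quadratic in $\beta_x$ with positive leading coefficient $\Sigma_nx_n^2$, the stationary point is a minimum rather than a maximum.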



In [3]:
# Univariate linear regression
x = np.array([1, 2, 3, 4])                      
y = np.array([.5, 1.1, 1.4, 2.1])
beta = (x*y).sum() / (x**2).sum()
beta

0.51

In [6]:
# Writing the sum of corresponding products as a dot product in NumPy
beta = (x@y) / (x@x)
beta

0.51
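
As a cross-check, the same slope should come out of a general least-squares solver. This is a minimal sketch, assuming the same data as above; `beta_closed` and `beta_lstsq` are names introduced here for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.5, 1.1, 1.4, 2.1])

# Closed-form slope from the derivation
beta_closed = (x @ y) / (x @ x)

# np.linalg.lstsq expects a 2-D design matrix, so x becomes a single column.
# There is no column of ones, which matches the no-intercept model y = beta * x.
beta_lstsq, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)

print(beta_closed, beta_lstsq[0])
```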

#### Multivariate linear regression

$y_n=\beta_0+\beta_1x_n+\beta_2x_n^2+\epsilon_n$

$\beta = [\beta_0, \beta_1, \beta_2]^T$        -- $\beta$ is a vector containing all the $\beta_i$

$X$ is a matrix whose $n$-th row is $[1,x_n,x_n^2]$

$y=X\beta+\epsilon$  -- $X$ is a matrix and $\beta$ is a vector, so the $n$-th row of $X\beta$ is $\beta_0+\beta_1x_n+\beta_2x_n^2$

$E=\Sigma_n\epsilon_n^2=\epsilon^T\epsilon$ where $\epsilon$ is a vector of $\epsilon_n$

$E=(y-X\beta)^T(y-X\beta)$

Taking the derivative of $E$ with respect to $\beta$ and setting it to 0 gives

$-2X^T(y-X\beta)=0$

$X^TX\beta=X^Ty$

so, provided $X^TX$ is invertible,

$\beta=(X^TX)^{-1}(X^Ty)$


In [7]:
x = np.array([1, 2, 3, 4])
y = np.array([.5, 1.1, 1.4, 1])

In [9]:
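# Design matrix: one row [1, x_n, x_n**2] per data point
X = np.array([
    [1, x[0], x[0]**2],
    [1, x[1], x[1]**2],
    [1, x[2], x[2]**2],
    [1, x[3], x[3]**2]
])

# Normal equations: beta = (X^T X)^{-1} (X^T y)
beta = np.linalg.inv(X.T@X)@(X.T@y)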
beta

array([-0.7 ,  1.43, -0.25])
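
The design matrix and the solve step can also be written more compactly. The following is a sketch rather than the notebook's own method: it builds $X$ with `np.vander` and avoids the explicit inverse by using `np.linalg.solve` and `np.linalg.lstsq`, which are generally preferred numerically. The names `beta_solve`, `beta_lstsq` and `residual` are introduced here for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.5, 1.1, 1.4, 1.0])

# Columns [1, x, x**2]; increasing=True puts the constant column first
X = np.vander(x, N=3, increasing=True)

# Solve the normal equations (X^T X) beta = X^T y without forming the inverse
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Or hand the least-squares problem to lstsq directly
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the minimum, the residual is orthogonal to every column of X
residual = y - X @ beta_solve
print(beta_solve, beta_lstsq, X.T @ residual)
```

Both solvers should agree with the inverse-based result above, and `X.T @ residual` should be zero up to rounding, which is the condition $X^T(y-X\beta)=0$ behind the formula.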