# MATH 210 March 13, 2017

Agenda
* Applications to linear algebra
* Linear regression

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg as la
from numpy.linalg import matrix_power as mpow
%matplotlib inline

## Least Squares Regression

Given a collection of points $(x_0,y_0), (x_1, y_1), \dots, (x_N,y_N)$, we would like to find the coefficients $a$ and $b$ such that the line $y=a+bx$ fits the points in an optimal way. This means we want to minimize the sum of squared errors:

$$
SSE = \sum_{i=0}^N (y_i -(a+bx_i))^2
$$
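
As a quick illustration of the formula, the sketch below evaluates $SSE$ for two candidate pairs $(a,b)$ on a small made-up data set. The helper name `sse` and the sample points are assumptions for illustration only.

```python
import numpy as np

def sse(a, b, x, y):
    """Sum of squared errors of the line y = a + b*x (hypothetical helper)."""
    residuals = y - (a + b*x)
    return np.sum(residuals**2)

# Made-up points lying exactly on the line y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

print(sse(1, 2, x, y))  # the true line: every residual is zero
print(sse(0, 2, x, y))  # intercept off by 1: every residual equals 1
```

The pair with the smaller value is the better fit in the least squares sense.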

If we form the matrices $X$, $A$ and $Y$ given by

$$
X =
\begin{bmatrix}
1 & x_0 \\
1 & x_1 \\
\vdots & \vdots \\
1 & x_N
\end{bmatrix}
\quad
A =
\begin{bmatrix}
a \\
b
\end{bmatrix}
\quad
Y =
\begin{bmatrix}
y_0 \\
y_1 \\
\vdots \\
y_N
\end{bmatrix}
$$

then the coefficients $a$ and $b$ which minimize the sum of squared errors $SSE$ are given by the solution of

$$
(X^T X)A = (X^T)Y
$$
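
A brief sketch of why this equation appears: with $X$ the matrix whose rows are $(1, x_i)$, $A$ the column of coefficients and $Y$ the column of $y_i$ values, the sum of squared errors is the squared length of the residual vector $Y - XA$, and setting its gradient with respect to $A$ to zero gives the equation above.

$$
SSE = \| Y - XA \|^2 = (Y - XA)^T (Y - XA)
$$

$$
\nabla_A \, SSE = -2 X^T (Y - XA) = 0 \quad \Longrightarrow \quad (X^T X) A = X^T Y
$$

The solution is unique when $X^T X$ is invertible, which for this $X$ holds whenever the $x_i$ are not all equal.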

### Example
We generate noisy data from a line with known $a$ and $b$, then recover those values by regression.

In [None]:
a = 2
b = 3
N = 100
x = np.random.rand(N)
# uniform noise in [0, 0.1); its mean of 0.05 shifts the fitted intercept slightly above a
noise = 0.1*np.random.rand(N)
y = a + b*x + noise

plt.scatter(x,y);

To build the matrix $X$, we can use the function `numpy.hstack`, but first we have to reshape the array `x` from a 1D array of shape `(N,)` to a 2D NumPy array of shape `(N,1)`.

In [None]:
X = np.hstack((np.ones(N).reshape(N,1),x.reshape(N,1)))
Y = y.reshape(N,1)
# X[:5,:] selects the first 5 rows (indices 0 to 4) and every column
X[:5,:]
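
As an aside, the same matrix can be built without explicit reshaping. The sketch below shows one alternative using `numpy.column_stack`, which stacks 1D arrays as columns; the small array `x` is made-up sample data, not the data from the example.

```python
import numpy as np

x = np.array([0.1, 0.4, 0.7])  # made-up sample values
N = len(x)

# Construction from the cell above: reshape to (N,1) columns, then stack horizontally
X_hstack = np.hstack((np.ones(N).reshape(N,1), x.reshape(N,1)))

# Alternative: column_stack turns each 1D array into a column
X_colstack = np.column_stack((np.ones(N), x))

print(x.shape, x.reshape(N,1).shape)          # shapes before and after reshaping
print(np.array_equal(X_hstack, X_colstack))   # both constructions agree
```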

Use `scipy.linalg.solve` to solve $(X^T X)A = X^T Y$.

In [None]:
A = la.solve((X.T @ X), (X.T @ Y))
A
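
One way to sanity-check the result is to compare it with a library least squares routine. The sketch below regenerates data like the example's (with a fixed seed, an assumption made so the run is reproducible) and compares the solution of $(X^T X)A = X^T Y$ with `numpy.linalg.lstsq`. NumPy's solvers are used here so the block needs only one import.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
N = 100
x = rng.random(N)
y = 2 + 3*x + 0.1*rng.random(N)  # same model as the example: a = 2, b = 3

X = np.column_stack((np.ones(N), x))
Y = y.reshape(N,1)

# Solve (X^T X) A = X^T Y directly
A_normal = np.linalg.solve(X.T @ X, X.T @ Y)

# Library least squares solver applied to X and Y
A_lstsq, residuals, rank, sv = np.linalg.lstsq(X, Y, rcond=None)

print(A_normal.ravel())
print(A_lstsq.ravel())
print(np.allclose(A_normal, A_lstsq))
```

Because the noise here is uniform on $[0, 0.1)$ with mean $0.05$, the fitted intercept is expected to land near $2.05$ rather than exactly $2$.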

Let's plot the random data points together with the regression line we computed.

In [None]:
u = np.linspace(0,1,10)
v = A[0,0] + A[1,0]*u
plt.plot(u,v, 'r')
plt.scatter(x,y);

### Linear regression for quadratic models

The same idea works for fitting a quadratic model $y = a + bx + cx^2$ to a set of data points.

$$
X =
\begin{bmatrix}
1 & x_0 & x_0^2 \\
1 & x_1 & x_1^2 \\
\vdots & \vdots & \vdots \\
1 & x_N & x_N^2
\end{bmatrix}
\quad
A =
\begin{bmatrix}
a \\
b \\
c
\end{bmatrix}
\quad
Y =
\begin{bmatrix}
y_0 \\
y_1 \\
\vdots \\
y_N
\end{bmatrix}
$$
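
The pattern extends to any polynomial degree: column $k$ of $X$ holds the values $x_i^k$. A minimal sketch, assuming a hypothetical helper `design_matrix` (not part of the original notebook), uses `numpy.vander` with `increasing=True` so the columns are ordered $1, x, x^2, \dots$

```python
import numpy as np

def design_matrix(x, degree):
    """Return the matrix with columns 1, x, ..., x**degree (hypothetical helper)."""
    return np.vander(x, degree + 1, increasing=True)

x = np.array([-1.0, 0.0, 2.0])  # made-up sample values
X = design_matrix(x, 2)         # quadratic model: columns 1, x, x**2
print(X)
```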

In [None]:
a = 3
b = 5
c = 8
N = 1000
x = 2*np.random.rand(N) - 1 # rand samples from [0,1); multiplying by 2 gives [0,2) and subtracting 1 gives [-1,1)
noise = np.random.randn(N)
y = a + b*x + c*x**2 + noise
plt.scatter(x,y, alpha=0.5);

In [None]:
X = np.hstack((np.ones(N).reshape(N,1), x.reshape(N,1), (x**2).reshape(N,1)))
Y = y.reshape(N,1)

In [None]:
A = la.solve((X.T @ X), (X.T @ Y))
A
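
As a cross-check, `numpy.polyfit` fits the same quadratic model. Note that it returns coefficients from the highest power down, so the order is reversed relative to $A$. The sketch regenerates data like the example's with a fixed seed (an assumption for reproducibility) and uses NumPy's solver so that only one import is needed.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, an assumption for reproducibility
N = 1000
x = 2*rng.random(N) - 1
y = 3 + 5*x + 8*x**2 + rng.standard_normal(N)  # a = 3, b = 5, c = 8

X = np.column_stack((np.ones(N), x, x**2))
Y = y.reshape(N,1)
A = np.linalg.solve(X.T @ X, X.T @ Y)  # order: [a, b, c]

coeffs = np.polyfit(x, y, 2)           # order: [c, b, a]

print(A.ravel())
print(coeffs[::-1])                    # reversed to match the order of A
```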

In [None]:
u = np.linspace(-1,1,20)
v = A[0,0] + A[1,0]*u + A[2,0]*u**2
plt.plot(u,v, 'r')
plt.scatter(x,y);