## General Linear Models (GLM)

A statistical model is a set of equations that relates predictors (called <i>independent variables</i>) to observations (called <i>dependent variables</i>).

As an example, in a model that predicts a stock price over time, the independent variable is time and the dependent variable is the stock price.

Setting up a GLM involves:
- defining an equation that relates the predictor variables to the dependent variable
- mapping the observed data onto the equations
- transforming the series of equations into a matrix equation
- solving that equation


| y | x | h |
|---|---|---|
|175|70|177|
|181|86|190|
|159|63|180|
|165|62|172|

Each row of the table yields one equation of the form y = β<sub>0</sub> + xβ<sub>1</sub> + hβ<sub>2</sub>: \
175 = β<sub>0</sub> + 70β<sub>1</sub> + 177β<sub>2</sub> \
...

The goal is to calculate β<sub>0</sub>, β<sub>1</sub>, and β<sub>2</sub>.
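As a rough sketch of the "mapping" and "matrix equation" steps, the table can be turned into a design matrix with NumPy. The names `data` and `predictors` are illustrative, and the column of ones is added on the assumption that β<sub>0</sub> acts as an intercept.

```python
import numpy as np

# Observations copied from the table: columns are y, x, h.
data = np.array([
    [175, 70, 177],
    [181, 86, 190],
    [159, 63, 180],
    [165, 62, 172],
])

y = data[:, 0]            # dependent variable
predictors = data[:, 1:]  # independent variables x and h

# Prepend a column of ones so that beta_0 multiplies 1 in every row.
X = np.column_stack([np.ones(len(y)), predictors])

print(X.shape)  # (4, 3): four equations, three unknown coefficients
```

With this layout, each row of `X` multiplied by the coefficient vector reproduces one of the equations, so the whole system reads Xβ = y.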

In [1]:
import numpy as np

### Least Squares solution

Solving for the vector of unknown coefficients β means left-multiplying both sides of the equation by the left-inverse of the design matrix X.

The solution looks like this: 

Xβ = y

(X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup>Xβ = (X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup>y

β = (X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup>y
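To illustrate why (X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup> is called a left-inverse, the sketch below checks the identity on a small hypothetical matrix `X_demo` (chosen only for illustration, not data from this section). It assumes the columns are linearly independent, so that X<sup>T</sup>X is invertible.

```python
import numpy as np

# Hypothetical tall matrix with independent columns (illustration only).
X_demo = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0]])

# Left-inverse from the formula: inv(X^T X) @ X^T, shape (2, 3).
X_demo_leftinv = np.linalg.inv(X_demo.T @ X_demo) @ X_demo.T

# Multiplying on the left recovers the 2x2 identity...
print(np.allclose(X_demo_leftinv @ X_demo, np.eye(2)))  # True

# ...but multiplying on the right gives a 3x3 matrix that is not the identity.
print(np.allclose(X_demo @ X_demo_leftinv, np.eye(3)))  # False
```

This one-sidedness is why the order of multiplication matters in the derivation: the left-inverse cancels X only when it is applied from the left.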

From a geometric perspective, the goal is to find the coefficients β such that the weighted combination of the columns of X, written Xβ, is as close as possible to the data vector y. \
The difference between the two, ϵ = y - Xβ, is the error vector (the residuals).\
To find β and ϵ we can apply orthogonal vector projection.

The key insight is that the distance between y and the column space of X is shortest when the vector y - Xβ meets that space at a right angle, that is, when it is orthogonal to every column of X.

That means that:

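X<sup>T</sup>ϵ = 0

X<sup>T</sup>(y - Xβ) = 0

X<sup>T</sup>y - X<sup>T</sup>Xβ = 0

X<sup>T</sup>Xβ = X<sup>T</sup>y

β = (X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup>y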
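The orthogonality condition can be checked numerically. The sketch below reuses the table's data, computes the coefficients with the normal-equation formula, and multiplies the transposed design matrix by the residuals. The names `residuals` and `orthogonality` are illustrative, and the result should be read as zero only up to floating-point round-off.

```python
import numpy as np

# Design matrix (intercept column, x, h) and observations from the table.
X = np.array([[1, 70, 177],
              [1, 86, 190],
              [1, 63, 180],
              [1, 62, 172]], dtype=float)
y = np.array([175, 181, 159, 165], dtype=float)

# Coefficients from the normal equations, then the residual vector y - X beta.
betas = np.linalg.inv(X.T @ X) @ X.T @ y
residuals = y - X @ betas

# Each entry is the dot product of one column of X with the residuals.
orthogonality = X.T @ residuals
print(orthogonality)  # three values, each close to zero
```

The residuals themselves are generally not zero; only their projections onto the columns of X vanish.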

In [5]:
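X = np.array([[1,70,177],[1,86,190],[1,63,180],[1,62,172]])  # design matrix: column of ones, then x and h
y = np.array([175,181,159,165])  # observed dependent variable

X_leftinv = np.linalg.inv(X.T@X) @ X.T  # left-inverse of X

betas = X_leftinv @ y  # least-squares estimates of the three coefficients

print(betas)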

[276.85604056   1.51653185  -1.18716219]
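As a cross-check, the same coefficients can be obtained with NumPy's built-in least-squares solver, `np.linalg.lstsq`, which avoids forming the inverse of X<sup>T</sup>X explicitly. This is offered as an alternative sketch, not as the method used in this section. The names `betas_normal` and `betas_lstsq` are illustrative.

```python
import numpy as np

X = np.array([[1, 70, 177],
              [1, 86, 190],
              [1, 63, 180],
              [1, 62, 172]], dtype=float)
y = np.array([175, 181, 159, 165], dtype=float)

# Explicit left-inverse, as in the cell above.
betas_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# lstsq returns (solution, residual sum of squares, rank, singular values).
betas_lstsq, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

print(rank)  # 3: the columns of X are linearly independent
print(np.allclose(betas_normal, betas_lstsq, atol=1e-6))
```

For small, well-behaved problems like this one the two approaches should agree closely; for larger or nearly collinear design matrices, a dedicated solver is usually the more numerically robust choice.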
