# Deriving Multiple Linear Regression

In this notebook, we demonstrate how to derive the least squares solution for multiple linear regression. 

## Derivation
Suppose you want to predict the vector $y$ using variables $x_1, x_2, \dots, x_n$ stored as the columns of the design matrix $X$. This suggests the model $y = X\beta + e$, where $\beta$ denotes a vector of coefficients and $e$ denotes a vector of error terms not explained by the rest of the model (i.e., the residuals). Our goal is to determine the $\beta$ values that minimize the sum of squared errors. This approach is often called [*least squares*](https://en.wikipedia.org/wiki/Least_squares).
* Rearranging the model, the error is $e = y - X\beta$.
* We want to find the $\beta$ that minimizes $e \cdot e = e'e = (y-X\beta)'(y-X\beta) = y'y - 2\beta'X'y + \beta'X'X\beta$.
* To minimize, we take the derivative with respect to $\beta$ and set it equal to zero: $\frac{\partial(e \cdot e)}{\partial \beta} = -2X'y + 2X'X\beta = 0$.
* Solving for $\beta$ gives $X'X\beta=X'y$ and finally $\beta=(X'X)^{-1}X'y$, provided $X'X$ is invertible.
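
The expansion and derivative in the steps above can be written out term by term. The following is a short worked version that uses only the model $y = X\beta + e$ and the standard matrix-calculus identities $\partial(\beta'a)/\partial\beta = a$ and $\partial(\beta'A\beta)/\partial\beta = 2A\beta$ for symmetric $A$:

$$
\begin{aligned}
e'e &= (y - X\beta)'(y - X\beta) \\
    &= y'y - \beta'X'y - y'X\beta + \beta'X'X\beta \\
    &= y'y - 2\beta'X'y + \beta'X'X\beta \\
\frac{\partial (e'e)}{\partial \beta} &= -2X'y + 2X'X\beta
\end{aligned}
$$

The two middle terms combine because $\beta'X'y$ is a scalar and therefore equals its transpose $y'X\beta$. The second derivative is $2X'X$, which is positive semidefinite, so the stationary point is a minimum. It is the unique minimizer when the columns of $X$ are linearly independent, which is also the condition for $X'X$ to be invertible.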

Therefore, when trying to predict $y$ by $\hat{y} = X\hat{\beta}$, our estimate for $\beta$ should be $\hat{\beta} = (X'X)^{-1}X'y$.

## Application

In [None]:
import scalation.linalgebra._
val x = new MatrixD((8, 2), 1, 1.1, 2, 2.2, 3, 3.3, 4, 4.4, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.8)

In [None]:
val actual = VectorD(2, 3)                             // true coefficients used to generate y

In [None]:
val rng   = scalation.random.Normal(0, 0.01)           // random number generator
val noise = VectorD(for (i <- x.range1) yield rng.gen) // make some noise
val y     = (x * actual) + noise                       // make noisy response vector

In [None]:
val b = (x.t * x).inverse * x.t * y                    // least squares estimate b = (X'X)^{-1} X'y
val e = y - x * b                                      // residuals e = y - Xb

In [None]:
val sse = e dot e                                      // sum of squared errors e'e
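
As a cross-check that does not depend on ScalaTion, the closed form $\hat{\beta} = (X'X)^{-1}X'y$ can be sketched in plain Scala for the two-column case. The object `NormalEquationsSketch` and its methods `fit` and `sse` are hypothetical names introduced here for illustration, and the explicit 2-by-2 inverse is a simplification that does not generalize to more columns.

```scala
// Hypothetical helper (not part of ScalaTion): least squares for a two-column design matrix.
object NormalEquationsSketch {

  /** Solve (X'X) b = X'y for a design matrix with exactly two columns. */
  def fit(x: Array[Array[Double]], y: Array[Double]): (Double, Double) = {
    require(x.length == y.length, "x and y must have the same number of rows")
    require(x.forall(_.length == 2), "this sketch only handles two columns")

    // Entries of the symmetric 2x2 matrix X'X
    val a11 = x.map(r => r(0) * r(0)).sum
    val a12 = x.map(r => r(0) * r(1)).sum
    val a22 = x.map(r => r(1) * r(1)).sum

    // Entries of the vector X'y
    val c1 = x.zip(y).map { case (r, yi) => r(0) * yi }.sum
    val c2 = x.zip(y).map { case (r, yi) => r(1) * yi }.sum

    // X'X is invertible only when its determinant is nonzero
    val det = a11 * a22 - a12 * a12
    require(math.abs(det) > 1e-12, "columns of X are (nearly) collinear")

    // b = (X'X)^{-1} X'y using the closed-form inverse of a 2x2 matrix
    ((a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det)
  }

  /** Sum of squared errors e'e for the coefficients b. */
  def sse(x: Array[Array[Double]], y: Array[Double], b: (Double, Double)): Double =
    x.zip(y).map { case (r, yi) =>
      val e = yi - (r(0) * b._1 + r(1) * b._2) // residual for one row
      e * e
    }.sum
}
```

Applied to the eight rows of `x` above with a noise-free response built from the coefficients (2, 3), `fit` should return values close to 2 and 3 up to floating-point rounding, and `sse` should be close to zero. For larger or ill-conditioned problems, a QR or Cholesky factorization is generally preferred over forming an explicit inverse.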