# 1.3 Linear Regression

## 1.3.1 QR Decomposition

QR decomposition can be used to solve the linear least-squares problem and is the basis for the QR algorithm for computing eigenvalues. Here $Q$ is an orthogonal matrix and $R$ is an upper triangular matrix:

\begin{align}
QQ^T = I, \qquad Q^T = Q^{-1}, \qquad
R =
\begin{bmatrix}
r_{11} & r_{12} & \cdots & r_{1n}\\
0 & r_{22} & \cdots & r_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & r_{nn}
\end{bmatrix}
\end{align}

QR decomposition follows the form $A = QR$. Taking the transpose of both sides gives the equivalent form $A^T = R^T Q^T$, which is sometimes easier to work with.
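The properties above can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using a randomly generated $4 \times 4$ matrix as a stand-in for $A$ (the matrix, the seed, and the variable names are illustrative choices, not part of the notes):

```python
import numpy as np

# Illustrative input: a random square matrix (assumption, not from the notes).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# np.linalg.qr returns Q and R with A = Q R.
Q, R = np.linalg.qr(A)

# Q is orthogonal: Q^T Q = I, so Q^T acts as the inverse of Q.
orthogonal = np.allclose(Q.T @ Q, np.eye(4))

# R is upper triangular: zeroing everything below the diagonal changes nothing.
upper_triangular = np.allclose(R, np.triu(R))

# The factors reproduce A, and the transposed form is A^T = R^T Q^T.
reconstructs = np.allclose(Q @ R, A)
transpose_form = np.allclose(A.T, R.T @ Q.T)

print(orthogonal, upper_triangular, reconstructs, transpose_form)
```

All four checks use `np.allclose` rather than exact equality because the factorization is computed in floating-point arithmetic.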

## 1.3.2 Least-Squares Problems

Let $A \in \mathbb{R}^{n \times m}$ be an $n \times m$ matrix, and $b \in \mathbb{R}^n$ be a vector. The system $Ax=b$ is consistent if its solution set $K$, the set of $m$-tuples $s = (s_1, \ldots, s_m) \in \mathbb{R}^m$ satisfying $As = b$, is nonempty. If the solution set $K$ is empty, the system is inconsistent (i.e. if $K \neq \emptyset$, the system is consistent; if $K = \emptyset$, the system is inconsistent). When $n=m$ and $A$ is invertible, we can use the inverse $A^{-1}$ to solve the system; however, when $n > m$, $A$ is not square, so $A^{-1}$ does not exist and the system is generally inconsistent. We can still find a best approximate solution by solving the following least-squares problem:

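\begin{gather*}
\min_{x \in \mathbb{R}^m} || Ax-b||
\end{gather*}

The solution $x^*$ to this problem satisfies the normal equations $A^TAx^*=A^Tb$. If $A$ has full column rank and $A = QR$ with $Q^TQ = I$ and $R$ an invertible upper triangular matrix, substituting $A = QR$ into the normal equations gives $R^TRx^* = R^TQ^Tb$, which reduces to

\begin{gather*}
Rx^*=Q^Tb
\end{gather*}

Because $R$ is upper triangular, $x^*$ can be found using back substitution.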
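As an illustration of the procedure described above, here is a minimal sketch that solves $Rx^* = Q^Tb$ by back substitution and compares the result with the normal equations. It assumes NumPy and a matrix $A$ with full column rank; the helper names `back_substitution` and `lstsq_qr` and the small example system are hypothetical, chosen only for this example:

```python
import numpy as np

def back_substitution(R, c):
    """Solve R x = c for an upper triangular R with nonzero diagonal."""
    m = R.shape[0]
    x = np.zeros(m)
    # Work from the last row up; row i only involves x[i], ..., x[m-1].
    for i in range(m - 1, -1, -1):
        x[i] = (c[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

def lstsq_qr(A, b):
    """Least-squares solution of Ax ~ b via QR (assumes full column rank)."""
    # Default (reduced) mode: Q is n x m with orthonormal columns, R is m x m.
    Q, R = np.linalg.qr(A)
    return back_substitution(R, Q.T @ b)

# Illustrative overdetermined system with n = 3 equations and m = 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

x_star = lstsq_qr(A, b)

# Cross-check against the normal equations A^T A x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
print(x_star, x_normal)
```

Solving through $R$ avoids forming $A^TA$ explicitly, which is generally considered the better-conditioned route; the normal equations appear here only as a check.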

## 1.3.3 Linear Regression

Given a set of input data points $\{(x_i,y_i)\}^n_{i=1}$, where each $x_i=(x_{i1},\ldots,x_{id})^T$ is a column vector and each $y_i$ is a scalar, we can find an affine function (a linear function plus a constant) that best fits the data (i.e. the line of best fit when $d = 1$). This can be done by finding coefficients $\beta_j$ that minimize the following:

\begin{gather*}
\sum^n_{i=1}(y_i-\hat{y}_i)^2 \\
\hat{y}_i = \beta_0+\sum^d_{j=1}\beta_jx_{ij}
\end{gather*}

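Here $\hat{y}_i$ is the value predicted for $x_i$ by the linear model with coefficients $\beta_j$. Let $A \in \mathbb{R}^{n \times (d+1)}$ be the matrix whose $i$-th row is $(1, x_{i1}, \ldots, x_{id})$, and let $\beta = (\beta_0, \ldots, \beta_d)^T$ and $y = (y_1, \ldots, y_n)^T$. Then the vector of predictions is $A\beta$, and the problem can be transformed to the following least-squares problem: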
\begin{gather*}
\min_\beta ||y-A\beta||^2
\end{gather*}
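To make the transformation concrete, the following minimal sketch builds the matrix $A$ by prepending a column of ones to the inputs (so that $\beta_0$ acts as the constant term) and then solves the least-squares problem. It assumes NumPy; the function name `fit_affine` and the toy data are hypothetical, and `np.linalg.lstsq` is used here as one convenient solver rather than the method prescribed by the notes:

```python
import numpy as np

def fit_affine(X, y):
    """Return beta = (beta_0, ..., beta_d) minimizing ||y - A beta||^2.

    X has shape (n, d): one row per data point x_i.
    """
    n = X.shape[0]
    # Each row of A is (1, x_i1, ..., x_id); the leading 1 multiplies beta_0.
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

# Toy data (assumption): points lying exactly on the line y = 1 + 2x.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

beta = fit_affine(X, y)

# Predicted values: y_hat_i = beta_0 + sum_j beta_j * x_ij.
y_hat = beta[0] + X @ beta[1:]
print(beta, y_hat)
```

Because the toy data lie exactly on a line, the fitted coefficients recover the intercept and slope and the residual is zero up to rounding; with noisy data the same code returns the coefficients of the best-fit affine function instead.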