# The QR Decomposition and Regression

In multiple regression, a single quantitative response variable is modeled as a linear combination of quantitative explanatory variables plus error. There are $n$ observations and $p$ coefficients to estimate: an intercept and one coefficient for each of the $p - 1$ explanatory variables.

$$
y = b_0 + b_1 x_1 + \cdots + b_{p-1} x_{p-1} + \text{error}
$$

## The Matrix Formulation of Regression

In matrix form, where each row of $X$ holds all the measurements on a single individual, the model is:
$$
y = Xb + \text{error}
$$
Letting $X^T$ denote the transpose of the matrix $X$, the normal equations are formed by multiplying both sides by $X^T$ and requiring the residuals to be orthogonal to the columns of $X$:
$$
X^T y = X^T X b
$$
Notice that there are now exactly $p$ linear equations in $p$ unknowns. If the matrix $X$ has full rank (rank $p$ when $p < n$), then $X^T X$ is invertible and the solution to the normal equations is
$$
(X^T X)^{-1} X^T y = b
$$
where $b$ is the estimate of the parameters that minimizes the residual sum of squares. (A residual is the difference between the actual value of $y$ and the value of $y$ that is predicted by the model.)

On the surface, it appears that this requires the explicit inversion of a matrix, which requires substantial computation. A better algorithm for regression is found by using the QR decomposition.
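The normal equations above can be sketched in NumPy. This is a minimal illustration under stated assumptions: the simulated data, the seed, and the names `X`, `y`, and `b_true` are hypothetical and not part of the original text. It solves $X^T X b = X^T y$ as a linear system and compares the result with `np.linalg.lstsq`.

```python
import numpy as np

# Hypothetical simulated data: n = 50 observations, an intercept and 2 predictors.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
b_true = np.array([1.0, 2.0, -0.5])          # assumed "true" coefficients
y = X @ b_true + 0.1 * rng.normal(size=n)

# Solve the p-by-p normal equations X^T X b = X^T y as a linear system.
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Reference least-squares solution for comparison.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the solution, the residuals are orthogonal to the columns of X.
residuals = y - X @ b_normal

print(b_normal)
print(np.allclose(b_normal, b_lstsq))
print(np.allclose(X.T @ residuals, 0.0, atol=1e-8))
```

Note that `np.linalg.solve` factors the system rather than forming $(X^T X)^{-1}$ explicitly, so even this direct approach does not need a matrix inverse.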

## The QR Decomposition

Here is the mathematical fact. If $X$ is an $n$ by $p$ matrix of full rank (say $n > p$ and $\operatorname{rank}(X) = p$), then $X = QR$ where $Q$ is an $n$ by $p$ matrix with orthonormal columns and $R$ is a $p$ by $p$ upper triangular matrix. Since the columns of $Q$ are orthonormal, $Q^T Q = I$, the $p$ by $p$ identity matrix. Because $X$ has full rank, $R$ is invertible.
Beginning with the normal equations, see how the QR decomposition simplifies them.
$$
\begin{aligned}
X^T X b &= X^T y \\
(QR)^T (QR) b &= (QR)^T y \\
R^T (Q^T Q) R b &= R^T Q^T y \\
R^T R b &= R^T Q^T y \\
(R^T)^{-1} R^T R b &= (R^T)^{-1} R^T Q^T y \\
R b &= Q^T y
\end{aligned}
$$
If we let $z = Q^T y$, this becomes
$$
R b = z
$$
This is simply an upper triangular system of equations which may be quickly solved by back substitution.
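The derivation above can be sketched in a few lines of NumPy. This is only an illustration: the helper `back_substitution` and the simulated data are hypothetical names and values introduced here, and `np.linalg.qr` stands in for whatever QR routine is actually used.

```python
import numpy as np

def back_substitution(R, z):
    """Solve R b = z for an upper triangular, nonsingular R (hypothetical helper)."""
    p = R.shape[0]
    b = np.zeros(p)
    # Work upward from the last equation, which involves a single unknown.
    for i in range(p - 1, -1, -1):
        b[i] = (z[i] - R[i, i + 1:] @ b[i + 1:]) / R[i, i]
    return b

# Hypothetical simulated data: n = 40 observations, an intercept and 3 predictors.
rng = np.random.default_rng(1)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = rng.normal(size=n)

Q, R = np.linalg.qr(X)      # reduced QR: Q is n-by-p, R is p-by-p
z = Q.T @ y                 # z = Q^T y
b_qr = back_substitution(R, z)

# Reference least-squares solution for comparison.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(Q.T @ Q, np.eye(X.shape[1])))   # columns of Q are orthonormal
print(np.allclose(b_qr, b_lstsq))
```

The loop solves for the last coefficient first and then substitutes each solved value into the equation above it, which takes on the order of $p^2$ operations.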

This approach is efficient only if the QR decomposition itself can be computed quickly. The algorithm below, a form of Gram-Schmidt orthogonalization, creates the matrix $Q$ by overwriting $X$ one column at a time and fills in a new matrix $R$.
```
for j = 1 to p
{
	# r[j,j] is the norm of the jth column of X
	r[j,j] = sqrt( sum_{i=1}^n x[i,j]^2 )

	# normalize column j; it becomes the jth column of Q
	for i = 1 to n
	{
		x[i,j] = x[i,j] / r[j,j]
	}

	# remove the component along column j from each later column
	for k = j+1 to p
	{
		r[j,k] = sum_{i=1}^n x[i,j] x[i,k]
		for i = 1 to n
		{
			x[i,k] = x[i,k] - x[i,j] r[j,k]
		}
	}
}
```
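A possible Python translation of the loop above is sketched here. The function name `gram_schmidt_qr` and the random test matrix are assumptions made for illustration. Unlike the pseudocode, the sketch works on a copy instead of overwriting `X`, and it assumes `X` has full column rank so that no `R[j, j]` is zero.

```python
import numpy as np

def gram_schmidt_qr(X):
    """Return (Q, R) with X = Q R, following the column-by-column loop above."""
    Q = np.array(X, dtype=float)    # float copy, so the caller's X is untouched
    n, p = Q.shape
    R = np.zeros((p, p))            # entries below the diagonal stay zero
    for j in range(p):
        R[j, j] = np.sqrt(np.sum(Q[:, j] ** 2))   # norm of column j
        Q[:, j] /= R[j, j]                        # normalize column j
        for k in range(j + 1, p):
            R[j, k] = Q[:, j] @ Q[:, k]           # component of column k along column j
            Q[:, k] -= Q[:, j] * R[j, k]          # remove that component
    return Q, R

# Hypothetical test matrix: 8 observations, 3 columns.
rng = np.random.default_rng(2)
X = rng.normal(size=(8, 3))
Q, R = gram_schmidt_qr(X)

print(np.allclose(Q @ R, X))             # the factors reproduce X
print(np.allclose(Q.T @ Q, np.eye(3)))   # columns of Q are orthonormal
print(np.allclose(R, np.triu(R)))        # R is upper triangular
```

The factors returned by `np.linalg.qr` can differ from these in the signs of the columns of `Q` and the matching rows of `R`. This loop always produces a positive diagonal in `R`.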
Last modified: April 25, 1997

Bret Larget, larget@mathcs.duq.edu

```python
import numpy as np
import scipy
from scipy import linalg
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display, Latex, Math
%matplotlib inline

from IPython.core.interactiveshell import InteractiveShell
sh = InteractiveShell.instance()


def number_to_str(n, cut=5):
    """Format n with `cut` decimals if str(n) uses exponent notation or is long."""
    ns = str(n)
    format_ = '{0:.' + str(cut) + 'f}'
    if 'e' in ns or ('.' in ns and len(ns) > cut + 1):
        return format_.format(n)
    else:
        return str(n)


def matrix_to_latex(mat, style='bmatrix'):
    """Return LaTeX source for a 1-D or 2-D array (None for other shapes)."""
    if isinstance(mat, np.matrix):
        mat = mat.A
    head = r'\begin{' + style + '}'
    tail = r'\end{' + style + '}'
    if len(mat.shape) == 1:
        body = r'\\'.join([str(el) for el in mat])
        return head + body + tail
    elif len(mat.shape) == 2:
        lines = []
        for row in mat:
            lines.append('&'.join([number_to_str(el) for el in row]) + r'\\')
        s = head + ' '.join(lines) + tail
        return s
    return None


# Register the LaTeX printer for NumPy arrays in this IPython session.
sh.display_formatter.formatters['text/latex'].type_printers[np.ndarray] = matrix_to_latex

np.array([[1, 2], [3, 4]])
```

```
array([[1, 2],
       [3, 4]])
```

```python
A = np.arange(9).reshape(3, 3)
A
```

```
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
```

```python
def show_decomposition(*args):
    """Render a mix of strings and arrays as one LaTeX math expression."""
    latex = ''
    for arg in args:
        if isinstance(arg, str):
            latex += arg
        else:
            latex += matrix_to_latex(arg)
    latex = '$' + latex + '$'
    display(Math(latex))


Q, R = np.linalg.qr(A)
show_decomposition(A, '=', Q, R)
```

```python
A = np.arange(40).reshape(10, 4)
A
```

```
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31],
       [32, 33, 34, 35],
       [36, 37, 38, 39]])
```

```python
Q, R = np.linalg.qr(A)
show_decomposition(A, '=', Q, R)
```

```python
# Product of the first column of Q and the first row of R (a rank-1 matrix).
Q[:, :-3].dot(R[:-3, :])
```

```
array([[ -0.        ,  -0.        ,  -0.        ,  -0.        ],
       [  4.        ,   4.15789474,   4.31578947,   4.47368421],
       [  8.        ,   8.31578947,   8.63157895,   8.94736842],
       [ 12.        ,  12.47368421,  12.94736842,  13.42105263],
       [ 16.        ,  16.63157895,  17.26315789,  17.89473684],
       [ 20.        ,  20.78947368,  21.57894737,  22.36842105],
       [ 24.        ,  24.94736842,  25.89473684,  26.84210526],
       [ 28.        ,  29.10526316,  30.21052632,  31.31578947],
       [ 32.        ,  33.26315789,  34.52631579,  35.78947368],
       [ 36.        ,  37.42105263,  38.84210526,  40.26315789]])
```
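The 10 by 4 matrix `A` above is not of full rank: every column equals the first column plus a constant, so its rank is 2. As a hedged sketch (the tolerance `atol=1e-8` and the variable `rank` are assumptions made here), the following checks that the rows of `R` beyond the rank are numerically zero, and that two columns of `Q` with two rows of `R` already reproduce `A`, whereas the single-column product shown above does not.

```python
import numpy as np

A = np.arange(40).reshape(10, 4)
Q, R = np.linalg.qr(A)

# Numerical rank of A, computed from its singular values.
rank = np.linalg.matrix_rank(A)
print(rank)

# Rows of R beyond the rank should be zero up to rounding error.
print(np.allclose(R[rank:, :], 0.0, atol=1e-8))

# Keeping `rank` columns of Q and rows of R recovers A; keeping only one does not.
print(np.allclose(Q[:, :rank] @ R[:rank, :], A))
print(np.allclose(Q[:, :1] @ R[:1, :], A))
```

With such a matrix in the role of $X$, the diagonal of $R$ contains (numerical) zeros, so back substitution for $Rb = z$ breaks down. This is why the derivation assumes that $X$ has full rank.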